Optimal Achievable Rates for
Interference Networks with Random Codes 

Bernd Bandemer, Abbas El Gamal, and Young-Han Kim

Abstract 

The optimal rate region for interference networks is characterized when encoding is restricted to random code 
ensembles with superposition coding and time sharing. A simple simultaneous nonunique decoding rule, under 
which each receiver decodes for the intended message as well as the interfering messages, is shown to achieve this 
optimal rate region regardless of the relative strengths of signal, interference, and noise. This result implies that the
Han-Kobayashi bound, the best known inner bound on the capacity region of the two-user-pair interference channel, 
cannot be improved merely by using the optimal maximum likelihood decoder. 

Index Terms 

network information theory, interference network, superposition coding, 
maximum likelihood decoding, simultaneous decoding, Han-Kobayashi bound. 

I. Introduction 

Consider a communication scenario in which multiple senders communicate independent messages to multiple 
receivers over a network with interference. What is the set of simultaneously achievable rate tuples for reliable 
communication? What coding scheme achieves this capacity region? Answering these questions involves joint
optimization of the encoding and decoding functions, which has remained elusive even for the case of two 
sender-receiver pairs. 

With a complete theory in terra incognita, in this paper we take a simpler modular approach to these questions. 
Instead of searching for the optimal encoding functions, suppose rather that the encoding functions are restricted to 
realizations of a given random code ensemble of a certain structure. What is the set of simultaneously achievable 
rate tuples so that the probability of decoding error, when averaged over the random code ensemble, can be made 
arbitrarily small? To be specific, we focus on random code ensembles with superposition coding and time sharing of 
independent and identically distributed (i.i.d.) codewords. This class of random code ensembles includes those used 
in the celebrated Han-Kobayashi coding scheme [13]. 
We characterize the set of rate tuples achievable by the random code ensemble for an interference network
as the intersection of rate regions for its component multiple access channels in which each receiver recovers its
intended messages as well as appropriately chosen unintended messages. More specifically, the rate region $\mathscr{R}^*$ for
the interference network with senders $[1 : K] = \{1, 2, \ldots, K\}$, each communicating an independent message, and
receivers $[1 : L]$, each required to recover a subset $\mathcal{D}_1, \ldots, \mathcal{D}_L \subseteq [1 : K]$ of messages, is

$$\mathscr{R}^* = \bigcap_{l \in [1:L]} \; \bigcup_{\mathcal{S} \supseteq \mathcal{D}_l} \mathscr{R}_{\mathrm{MAC}(\mathcal{S}, l)}, \tag{1}$$

where $\mathscr{R}_{\mathrm{MAC}(\mathcal{S}, l)}$ is the achievable rate region for the multiple access
channel with senders $\mathcal{S}$ and receiver $l$ when the codewords from the other senders $[1 : K] \setminus \mathcal{S}$ are treated as random
noise.

A direct approach to proving this result would be to analyze the average performance of the optimal decoding 
rule for each realization of the random code ensemble that minimizes the probability of decoding error, namely, 
maximum likelihood decoding (MLD). This analysis, however, is unnecessarily cumbersome. We instead take an
indirect yet conducive approach that is common in information theory. First, we show that any rate tuple inside $\mathscr{R}^*$
is achieved by using the typicality-based simultaneous nonunique decoding (SND) rule [7], [10], [18], in which each
receiver attempts to recover the codewords from its intended senders and (potentially nonuniquely) the codewords
from interfering senders. Second, we show that if the average probability of error of MLD for the random code
ensemble is asymptotically zero, then its rate tuple must lie in $\mathscr{R}^*$. The key to proving the second step is to show that
after a maximal set of messages has been recovered, the remaining signal at each receiver is distributed essentially
independently and identically. The two-step approach taken here is reminiscent of the random coding proof for the
capacity of the point-to-point channel [21], wherein a suboptimal (in the sense of the probability of error) decoding
rule based on the notion of joint typicality can achieve the same rate as MLD when used for random code ensembles.

Our result has several implications.

• It shows that incorporating the structure of interference into decoding, when properly done as in MLD and SND, 
always achieves higher or equal rates compared to treating interference as random noise; thus, the traditional 
wisdom of distinguishing between decoding for the interference at high signal-to-noise ratio and ignoring the 
interference at low signal-to-noise ratio does not provide any improvement on achievable rates. 

• It shows that the Han-Kobayashi inner bound [13], [7], [10, Theorem 6.4], which was established using the
random code ensemble and a typicality-based simultaneous decoding rule, cannot be improved by using a more
powerful decoding rule such as MLD.

• It generalizes the result by Motahari and Khandani [17], and Baccelli, El Gamal, and Tse [2] on the optimal
rate region for $K$-user-pair Gaussian interference channels with point-to-point Gaussian random code ensembles
to arbitrary (not necessarily Gaussian) random code ensembles with time sharing and superposition coding.

• It shows that the Cover-van der Meulen inner bound with no common auxiliary random variable on the capacity 
region of the two-receiver broadcast channel [9], [23], [10, Eq. (8.8)] (and thus Marton's inner bound [16], [10, 
Theorem 8.3]) can be improved by using SND to include the superposition coding inner bound [8], [4], [10, 
Theorem 5.1].

• It shows that the interference decoding rate region for the three-user-pair deterministic interference channel
in [3] is the optimal rate region achievable by point-to-point random code ensembles and time sharing.

We illustrate the main result and its implications via the following two simple examples.

A. Interference Channels with Two User Pairs 

Consider the two-user-pair discrete memoryless interference channel (2-DM-IC) $p(y_1, y_2 | x_1, x_2)$ with input
alphabets $\mathcal{X}_1$ and $\mathcal{X}_2$ and output alphabets $\mathcal{Y}_1$ and $\mathcal{Y}_2$, depicted in Figure 1. Here sender $j = 1, 2$ wishes to
communicate a message $M_j$ to its respective receiver via $n$ transmissions over the shared interference channel. Each
message $M_j$, $j = 1, 2$, is separately encoded into a codeword $X_j^n = (X_{j1}, X_{j2}, \ldots, X_{jn})$ and transmitted over the
channel. Upon receiving the sequence $Y_j^n$, receiver $j = 1, 2$ finds an estimate $\hat{M}_j$ of the message $M_j$.



Figure 1. Two-user-pair discrete memoryless interference channel.



We now consider the standard random coding analysis for inner bounds on the set of achievable rate pairs (the
capacity region) of the 2-DM-IC. Given a product input pmf $p(x_1) p(x_2)$, suppose that the codewords $x_j^n(m_j)$,
$m_j \in [1 : 2^{nR_j}] = \{1, 2, \ldots, 2^{nR_j}\}$, for $j = 1, 2$ are generated randomly, each drawn according to $\prod_{i=1}^n p_{X_j}(x_{ji})$.

We recall the rate regions achieved by employing the following simple suboptimal decoding rules, described for
receiver 1 (cf. [10, Sec. 6.2]).

• Treating interference as noise (IAN). Receiver 1 finds the unique message $\hat{m}_1$ such that $(x_1^n(\hat{m}_1), y_1^n)$ is jointly
typical. (See the end of this section for the definition of joint typicality.) It can be shown that the average
probability of decoding error for receiver 1 tends to zero as $n \to \infty$ if

$$R_1 < I(X_1; Y_1). \tag{2}$$

The corresponding rate region (IAN region) is depicted in Figure 2(a).

• Simultaneous decoding (SD). Receiver 1 finds the unique message pair $(\hat{m}_1, \hat{m}_2)$ such that $(x_1^n(\hat{m}_1), x_2^n(\hat{m}_2), y_1^n)$
is jointly typical. The average probability of decoding error for receiver 1 tends to zero as $n \to \infty$ if

$$R_1 < I(X_1; Y_1 | X_2), \tag{3a}$$
$$R_2 < I(X_2; Y_1 | X_1), \tag{3b}$$
$$R_1 + R_2 < I(X_1, X_2; Y_1). \tag{3c}$$

The corresponding rate region (SD region) is depicted in Figure 2(b).
Now, consider simultaneous nonunique decoding (SND) in which receiver 1 finds the unique $\hat{m}_1$ such that
$(x_1^n(\hat{m}_1), x_2^n(m_2), y_1^n)$ is jointly typical for some $m_2$. Clearly, any rate pair in the SD rate region (3) is achievable via
SND. Less obviously, any rate pair in the IAN region (2) is also achievable via SND as we show in the achievability
proof of Theorem 1 in Section II. Hence, SND can achieve any rate pair in the union of the IAN and SD regions, that
is, the rate region $\mathscr{R}_1$ as depicted in Figure 2(c). Similarly, the average probability of decoding error for receiver 2
using SND tends to zero as $n \to \infty$ if the rate pair $(R_1, R_2)$ is in $\mathscr{R}_2$, which is defined analogously by exchanging
the roles of the two users (see Figure 2(d)). Combining the decoding requirements for both receivers yields the rate
region $\mathscr{R}_1 \cap \mathscr{R}_2$.

This rate region $\mathscr{R}_1 \cap \mathscr{R}_2$ turns out to be optimal for the given random code ensemble. As shown in the converse
proof of Theorem 1, if the probability of error for MLD averaged over the random code ensemble tends to zero as
$n \to \infty$, then the rate pair $(R_1, R_2)$ must reside inside the closure of $\mathscr{R}_1 \cap \mathscr{R}_2$. Thus, SND achieves the same rate
region as MLD (for random code ensembles of the given structure).
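As a rough numerical illustration of the three decoding rules at receiver 1, the following Python sketch computes the mutual information terms in (2) and (3) for a toy channel and tests membership in the IAN, SD, and SND regions. The channel (a noiseless binary adder $Y_1 = X_1 + X_2$ with uniform inputs), the helper names, and the rate pairs are assumptions chosen for illustration; they do not come from the paper.

```python
from itertools import product
from math import log2

def entropy(pmf):
    """Entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def marginal(joint, idx):
    """Marginal pmf of the coordinates listed in the tuple idx."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out

def cond_mi(joint, a, b, c=()):
    """I(A; B | C) = H(A,C) + H(B,C) - H(A,B,C) - H(C); a, b, c are index tuples."""
    h = lambda idx: entropy(marginal(joint, idx))
    return h(a + c) + h(b + c) - h(a + b + c) - h(c)

# Hypothetical toy channel: Y1 = X1 + X2 (integer sum), X1 and X2 uniform on {0, 1}.
joint = {(x1, x2, x1 + x2): 0.25 for x1, x2 in product((0, 1), repeat=2)}
X1, X2, Y1 = (0,), (1,), (2,)           # coordinate indices within each outcome

I1 = cond_mi(joint, X1, Y1)             # I(X1; Y1)
I1_2 = cond_mi(joint, X1, Y1, X2)       # I(X1; Y1 | X2)
I2_1 = cond_mi(joint, X2, Y1, X1)       # I(X2; Y1 | X1)
I12 = cond_mi(joint, X1 + X2, Y1)       # I(X1, X2; Y1); "+" concatenates index tuples

def in_ian(r1, r2): return r1 < I1
def in_sd(r1, r2): return r1 < I1_2 and r2 < I2_1 and r1 + r2 < I12
def in_snd(r1, r2): return in_ian(r1, r2) or in_sd(r1, r2)   # union of IAN and SD
```

In this toy example a pair such as (0.4, 5.0) lies in the IAN region but not in the SD region, while (0.9, 0.5) lies in the SD region but not in the IAN region; both lie in the SND region.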



Figure 2. Achievable rate regions for the 2-DM-IC: (a) treating interference as noise, (b) using simultaneous decoding, (c) using
simultaneous nonunique decoding ($\mathscr{R}_1$); note that $\mathscr{R}_1$ is the union of the regions in (a) and (b); and (d) using simultaneous
nonunique decoding at receiver 2 ($\mathscr{R}_2$).



B. Broadcast Channels with Two Receivers 

In the previous example, the random code ensemble for each sender had the structure of random code ensembles
for point-to-point communication channels [21]. To illustrate our result for superposition coding, consider the
two-receiver discrete memoryless broadcast channel (2-DM-BC) $p(y_1, y_2 | x)$ with input alphabet $\mathcal{X}$ and output
alphabets $\mathcal{Y}_1$ and $\mathcal{Y}_2$. Here the sender wishes to communicate two independent messages to their respective receivers
via $n$ transmissions over the broadcast channel. Each message pair $(M_1, M_2)$ is encoded into a codeword $X^n$
and transmitted over the channel. Upon receiving the sequence $Y_j^n$, receiver $j = 1, 2$ finds an estimate $\hat{M}_j$ of the
message $M_j$.

We consider a special case of the classical coding scheme by Cover [9] and van der Meulen [23], illustrated
in Figure 3. Given a product pmf $p(u_1) p(u_2)$ and a function $x(u_1, u_2)$, suppose that the codewords $x^n(m_1, m_2)$,
$(m_1, m_2) \in [1 : 2^{nR_1}] \times [1 : 2^{nR_2}]$, are given as $x_i(m_1, m_2) = x(u_{1i}(m_1), u_{2i}(m_2))$, $i \in [1 : n]$, where the sequences
$u_j^n(m_j)$, $m_j \in [1 : 2^{nR_j}]$, for $j = 1, 2$ are generated randomly, each drawn according to $\prod_{i=1}^n p_{U_j}(u_{ji})$. Thus, the
transmitted codeword is a "superposition" of two codewords $u_1^n(m_1)$ and $u_2^n(m_2)$, which is literally the case when
$x(u_1, u_2)$ is additive.
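The codebook construction just described can be sketched in a few lines of Python. This is only an illustration under assumed parameters: the block length, rates, input pmfs, the XOR combining function, and all helper names are hypothetical choices, and indices are 0-based rather than 1-based.

```python
import random

def random_codebook(num_messages, n, pmf, rng):
    """Draw num_messages i.i.d. length-n codewords from pmf (dict symbol -> prob)."""
    symbols, weights = zip(*pmf.items())
    return [tuple(rng.choices(symbols, weights=weights, k=n))
            for _ in range(num_messages)]

def superpose(u1, u2, x):
    """Apply the symbol-by-symbol map x(u1, u2) to two component codewords."""
    return tuple(x(a, b) for a, b in zip(u1, u2))

rng = random.Random(0)            # fixed seed, only for reproducibility
n, R1, R2 = 8, 0.25, 0.5          # hypothetical block length and rates
book1 = random_codebook(2 ** round(n * R1), n, {0: 0.5, 1: 0.5}, rng)
book2 = random_codebook(2 ** round(n * R2), n, {0: 0.9, 1: 0.1}, rng)

# x(u1, u2) = u1 XOR u2 is one possible choice (additive over GF(2)).
xor = lambda a, b: a ^ b
codeword = superpose(book1[2], book2[5], xor)   # message pair (2, 5), 0-based
```

Any other symbol-wise function could be passed in place of `xor`; the two component codebooks are generated independently, which mirrors the product pmf $p(u_1) p(u_2)$.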



Figure 3. Broadcast channel with Cover-van der Meulen coding.



Alternatively, this superposition coding scheme can be viewed as first transforming the underlying broadcast
channel into a two-user-pair interference channel

$$p(y_1, y_2 | u_1, u_2) = p(y_1, y_2 | x(u_1, u_2))$$

and then applying the random coding scheme for the two-user-pair interference channel discussed in Subsection I-A.
Hence, the random coding analysis thereof can be readily applied. For example, suppose that each receiver decodes
for its intended codeword while treating the other codeword as noise (cf. (2)). Then it can be shown that the average
probability of decoding error tends to zero as $n \to \infty$ if

$$R_1 < I(U_1; Y_1), \tag{4a}$$
$$R_2 < I(U_2; Y_2). \tag{4b}$$

Taking the union over all pmfs $p(u_1) p(u_2)$ and functions $x(u_1, u_2)$, we obtain the Cover-van der Meulen inner
bound (with no common auxiliary random variable) on the capacity region [10, Eq. (8.8)].

On the other hand, consider the superposition coding inner bound on the capacity region [8], [4], [10, Theorem 5.1],
which is the set of rate pairs such that

$$R_1 < I(U_1; Y_1 | U_2) = I(X; Y_1 | U_2), \tag{5a}$$
$$R_2 < I(U_2; Y_2), \tag{5b}$$
$$R_1 + R_2 < I(U_1, U_2; Y_1) \tag{5c}$$

for some pmf $p(u_1) p(u_2)$ and function $x(u_1, u_2)$. This inner bound corresponds to having receiver 1 decode for
both messages while receiver 2 treats the other codeword as noise. It can be shown [12] that this bound is not in
general contained in the Cover-van der Meulen inner bound, nor vice versa. (This statement remains true
even if the Cover-van der Meulen inner bound is replaced with Marton's inner bound without a common auxiliary
random variable [16], [10, Theorem 8.3].)

The distinction between the superposition coding inner bound and the Cover-van der Meulen inner bound is,
however, a mere side effect of the use of suboptimal decoding rules. Suppose now that both receivers use SND.
As in Subsection I-A, the average probability of decoding error tends to zero as $n \to \infty$ if $(R_1, R_2) \in \mathscr{R}_1 \cap \mathscr{R}_2$,
where $\mathscr{R}_1$ consists of rate pairs such that

$$R_1 < I(U_1; Y_1)$$

or

$$R_1 < I(U_1; Y_1 | U_2),$$
$$R_1 + R_2 < I(U_1, U_2; Y_1),$$

and $\mathscr{R}_2$ is similarly defined by exchanging the subscripts 1 and 2. The union of $\mathscr{R}_1 \cap \mathscr{R}_2$ over all pmfs $p(u_1) p(u_2)$
and functions $x(u_1, u_2)$ yields an inner bound on the capacity region. It is not hard to see that this region includes
both inner bounds (4) and (5). Furthermore, this region is the optimal rate region achieved by using MLD (see
Section III).

The rest of the paper is organized as follows. For simplicity of presentation, in Section II we formally define the
problem for the two-user-pair interference channel and establish our main result for the random code ensemble with
time sharing and no superposition coding. In Section III, we extend our result to a multiple-sender multiple-receiver
discrete memoryless interference network in which each sender has a single message and wishes to communicate it
to a subset of the receivers. This extension includes superposition coding with an arbitrary number of layers. In
Section IV, we specialize the result to the Han-Kobayashi coding scheme for the two-user-pair interference channel.
Most technical proofs are deferred to the Appendices.

Throughout we closely follow the notation in [10]. In particular, for $X \sim p(x)$ and $\epsilon \in (0, 1)$, we define the set
of $\epsilon$-typical $n$-sequences $x^n$ (or the typical set in short) [20] as

$$\mathcal{T}_\epsilon^{(n)}(X) = \{ x^n : |\#\{i : x_i = x\}/n - p(x)| \le \epsilon p(x) \text{ for all } x \in \mathcal{X} \}.$$
For a tuple of random variables $(X_1, \ldots, X_k)$, the joint typical set $\mathcal{T}_\epsilon^{(n)}(X_1, \ldots, X_k)$ is defined as the typical set
$\mathcal{T}_\epsilon^{(n)}((X_1, \ldots, X_k))$ for a single random variable $(X_1, \ldots, X_k)$. The joint typical set $\mathcal{T}_\epsilon^{(n)}(X_\mathcal{S})$ for a subtuple
$X_\mathcal{S} = (X_k : k \in \mathcal{S})$ is defined similarly for each $\mathcal{S} \subseteq [1 : k]$. We use $\delta(\epsilon) > 0$ to denote a generic function of
$\epsilon > 0$ that tends to zero as $\epsilon \to 0$. Similarly, we use $\epsilon_n \ge 0$ to denote a generic function of $n$ that tends to zero as
$n \to \infty$.
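A minimal Python sketch of the typicality test may help make the definition concrete. The function name and the representation of the pmf as a dictionary are assumptions; the comparison is written with a non-strict inequality, following the robust typicality convention of [10].

```python
from collections import Counter

def is_typical(seq, pmf, eps):
    """Return True if seq is eps-typical with respect to pmf (dict symbol -> prob)."""
    n = len(seq)
    counts = Counter(seq)
    # A symbol outside the support has p(x) = 0, so it must not appear at all.
    if any(sym not in pmf for sym in counts):
        return False
    # Empirical frequency of every symbol must be within eps * p(x) of p(x).
    return all(abs(counts[x] / n - p) <= eps * p for x, p in pmf.items())
```

Joint typicality of several sequences can be checked with the same function by zipping them into one sequence of tuples and passing the joint pmf.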

II. DM-IC with Two User Pairs

Consider the two-user-pair discrete memoryless interference channel (2-DM-IC) $p(y_1, y_2 | x_1, x_2)$ introduced in
Subsection I-A (see Figure 1). A $(2^{nR_1}, 2^{nR_2}, n)$ code $\mathcal{C}_n$ for the 2-DM-IC consists of

• two message sets $[1 : 2^{nR_1}]$ and $[1 : 2^{nR_2}]$,

• two encoders, where encoder 1 assigns a codeword $x_1^n(m_1)$ to each message $m_1 \in [1 : 2^{nR_1}]$ and encoder 2
assigns a codeword $x_2^n(m_2)$ to each message $m_2 \in [1 : 2^{nR_2}]$, and

• two decoders, where decoder 1 assigns an estimate $\hat{m}_1$ or an error message e to each received sequence $y_1^n$
and decoder 2 assigns an estimate $\hat{m}_2$ or an error message e to each received sequence $y_2^n$.

We assume that the message pair $(M_1, M_2)$ is uniformly distributed over $[1 : 2^{nR_1}] \times [1 : 2^{nR_2}]$. The average
probability of error for the code $\mathcal{C}_n$ is defined as

$$P_e^{(n)}(\mathcal{C}_n) = P\{(\hat{M}_1, \hat{M}_2) \ne (M_1, M_2)\}.$$

A rate pair $(R_1, R_2)$ is said to be achievable for the 2-DM-IC if there exists a sequence of $(2^{nR_1}, 2^{nR_2}, n)$ codes
$\mathcal{C}_n$ such that $\lim_{n \to \infty} P_e^{(n)}(\mathcal{C}_n) = 0$. The capacity region of the 2-DM-IC is the closure of the set of achievable
rate pairs $(R_1, R_2)$.

We now limit our attention to a randomly generated code ensemble with a special structure. Let $p = p(q, x_1, x_2) =
p(q) p(x_1|q) p(x_2|q)$ be a given pmf on $\mathcal{Q} \times \mathcal{X}_1 \times \mathcal{X}_2$, where $\mathcal{Q}$ is a finite alphabet. Suppose that the codewords
$X_1^n(m_1)$, $m_1 \in [1 : 2^{nR_1}]$, and $X_2^n(m_2)$, $m_2 \in [1 : 2^{nR_2}]$, that constitute the codebook, are generated randomly as
follows:

• Let $Q^n \sim \prod_{i=1}^n p_Q(q_i)$.

• Let $X_1^n(m_1) \sim \prod_{i=1}^n p_{X_1|Q}(x_{1i}|q_i)$, $m_1 \in [1 : 2^{nR_1}]$, conditionally independent given $Q^n$.

• Let $X_2^n(m_2) \sim \prod_{i=1}^n p_{X_2|Q}(x_{2i}|q_i)$, $m_2 \in [1 : 2^{nR_2}]$, conditionally independent given $Q^n$.

Each instance $\{(x_1^n(m_1), x_2^n(m_2)) : (m_1, m_2) \in [1 : 2^{nR_1}] \times [1 : 2^{nR_2}]\}$ of such generated codebooks, along with
the corresponding optimal decoders, constitutes a $(2^{nR_1}, 2^{nR_2}, n)$ code. We refer to the random code ensemble
generated in this manner as the $(2^{nR_1}, 2^{nR_2}, n; p)$ random code ensemble.
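The three generation steps can be sketched as follows. This is a hedged illustration only: the function names, the dictionary encoding of the pmfs, and the particular time-sharing example (in which each sender is silent in one of the two time-sharing states) are assumptions, not part of the paper.

```python
import random

def draw(pmf, rng):
    """Draw one symbol from pmf (dict symbol -> prob)."""
    symbols, weights = zip(*pmf.items())
    return rng.choices(symbols, weights=weights, k=1)[0]

def generate_ensemble(n, R1, R2, p_q, p_x1_given_q, p_x2_given_q, rng):
    """Return (q^n, codebook 1, codebook 2) following the three generation steps."""
    q = [draw(p_q, rng) for _ in range(n)]            # Q^n i.i.d. according to p_Q

    def codebook(rate, cond):
        size = 2 ** round(n * rate)                   # assumes n * rate is an integer
        # Each codeword is drawn symbol by symbol from p(x | q_i), given the shared q^n.
        return [[draw(cond[qi], rng) for qi in q] for _ in range(size)]

    return q, codebook(R1, p_x1_given_q), codebook(R2, p_x2_given_q)

rng = random.Random(1)                                # fixed seed for reproducibility
p_q = {0: 0.5, 1: 0.5}
p_x1 = {0: {0: 1.0}, 1: {0: 0.5, 1: 0.5}}             # sender 1 is silent when q = 0
p_x2 = {0: {0: 0.5, 1: 0.5}, 1: {0: 1.0}}             # sender 2 is silent when q = 1
q, book1, book2 = generate_ensemble(6, 0.5, 0.5, p_q, p_x1, p_x2, rng)
```

The same time-sharing sequence `q` is used for both codebooks, which is what makes the codewords conditionally independent given $Q^n$ rather than unconditionally independent.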

Definition 1 (Random coding optimal rate region). Given a pmf $p = p(q) p(x_1|q) p(x_2|q)$, the optimal rate region
$\mathscr{R}^*(p)$ achievable by the $p$-distributed random code ensemble is the closure of the set of rate pairs $(R_1, R_2)$ such
that the sequence of $(2^{nR_1}, 2^{nR_2}, n; p)$ random code ensembles $\mathcal{C}_n$ satisfies

$$\lim_{n \to \infty} \mathsf{E}_{\mathcal{C}_n}[P_e^{(n)}(\mathcal{C}_n)] = 0,$$

where the expectation is with respect to the random code ensemble $\mathcal{C}_n$.

To characterize the random coding optimal rate region, we define $\mathscr{R}_1(p)$ to be the set of rate pairs $(R_1, R_2)$ such
that

$$R_1 < I(X_1; Y_1 | Q) \tag{6a}$$

or

$$R_1 < I(X_1; Y_1 | X_2, Q), \tag{6b}$$
$$R_1 + R_2 < I(X_1, X_2; Y_1 | Q). \tag{6c}$$

Similarly, define $\mathscr{R}_2(p)$ by making the index substitution $1 \leftrightarrow 2$. We are now ready to state the main result of the
section.

Theorem 1. Given a pmf $p = p(q) p(x_1|q) p(x_2|q)$, the optimal rate region of the DM-IC $p(y_1, y_2 | x_1, x_2)$ achievable
by the $p$-distributed random code ensemble is

$$\mathscr{R}^*(p) = \mathscr{R}_1(p) \cap \mathscr{R}_2(p).$$

Before we prove the theorem, we point out a few important properties of the random coding optimal rate region.

Remark 1 (MAC form). Let $\mathscr{R}_{1,\mathrm{IAN}}(p)$ be the set of rate pairs $(R_1, R_2)$ such that

$$R_1 < I(X_1; Y_1 | Q),$$

that is, the achievable rate (region) for the point-to-point channel $p(y_1|x_1)$ by treating the interfering signal $X_2$ as
noise. Let $\mathscr{R}_{1,\mathrm{SD}}(p)$ be the set of rate pairs $(R_1, R_2)$ such that

$$R_1 < I(X_1; Y_1 | X_2, Q),$$
$$R_2 < I(X_2; Y_1 | X_1, Q),$$
$$R_1 + R_2 < I(X_1, X_2; Y_1 | Q),$$

that is, the achievable rate region for the multiple access channel $p(y_1|x_1, x_2)$ by decoding for both messages $M_1$
and $M_2$ simultaneously. Then, we can express $\mathscr{R}_1(p)$ as

$$\mathscr{R}_1(p) = \mathscr{R}_{1,\mathrm{IAN}}(p) \cup \mathscr{R}_{1,\mathrm{SD}}(p),$$

which is referred to as the MAC form of $\mathscr{R}_1(p)$, since it is the union of the achievable rate regions of 1-sender and
2-sender multiple access channels. The region $\mathscr{R}_2(p)$ can be expressed similarly as the union of the
interference-as-noise region $\mathscr{R}_{2,\mathrm{IAN}}(p)$ and the simultaneous-decoding region $\mathscr{R}_{2,\mathrm{SD}}(p)$. Hence the optimal rate region $\mathscr{R}^*(p)$ can
be expressed as

$$\mathscr{R}^*(p) = (\mathscr{R}_{1,\mathrm{IAN}}(p) \cap \mathscr{R}_{2,\mathrm{IAN}}(p)) \cup (\mathscr{R}_{1,\mathrm{IAN}}(p) \cap \mathscr{R}_{2,\mathrm{SD}}(p)) \cup (\mathscr{R}_{1,\mathrm{SD}}(p) \cap \mathscr{R}_{2,\mathrm{IAN}}(p)) \cup (\mathscr{R}_{1,\mathrm{SD}}(p) \cap \mathscr{R}_{2,\mathrm{SD}}(p)), \tag{7}$$

which is achieved by taking the union over all possible combinations of treating interference as noise and simultaneous
decoding at the two receivers.

Remark 2 (Min form). The region $\mathscr{R}_1(p)$ in (6) can be equivalently characterized as the set of rate pairs $(R_1, R_2)$
such that

$$R_1 < I(X_1; Y_1 | X_2, Q), \tag{8a}$$
$$R_1 + \min\{R_2, I(X_2; Y_1 | X_1, Q)\} < I(X_1, X_2; Y_1 | Q). \tag{8b}$$

The minimum term in (8b) can be interpreted as the effective rate of the interfering signal $X_2$ at the receiver $Y_1$,
which is a monotone increasing function of $R_2$ and saturates at the maximum possible rate for distinguishing $X_2$
codewords; see [3]. When $R_2$ is small, all $X_2$ codewords are distinguishable and the effective rate equals the actual
code rate. In comparison, when $R_2$ is large, the codewords are not distinguishable and the effective rate equals
$I(X_2; Y_1 | X_1, Q)$, which is the maximum achievable rate for the channel from $X_2$ to $Y_1$.
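The equivalence between the form (6) and the min form (8) can be checked numerically on a grid of rate pairs. The sketch below is an illustration under assumed values: the three mutual information values are hypothetical, the sum-rate term is obtained from the chain rule $I(X_1, X_2; Y_1 | Q) = I(X_1; Y_1 | Q) + I(X_2; Y_1 | X_1, Q)$, and exact rational arithmetic is used to avoid rounding effects on the region boundaries.

```python
from fractions import Fraction as F

def in_mac_form(r1, r2, i1, i1_2, i2_1):
    """Conditions (6): the IAN bound, or the two simultaneous-decoding bounds."""
    i_sum = i1 + i2_1      # chain rule: I(X1,X2;Y1|Q) = I(X1;Y1|Q) + I(X2;Y1|X1,Q)
    return r1 < i1 or (r1 < i1_2 and r1 + r2 < i_sum)

def in_min_form(r1, r2, i1, i1_2, i2_1):
    """Conditions (8): single-rate bound plus the bound on the effective sum rate."""
    i_sum = i1 + i2_1
    return r1 < i1_2 and r1 + min(r2, i2_1) < i_sum

# Hypothetical values (in bits) of I(X1;Y1|Q), I(X1;Y1|X2,Q), I(X2;Y1|X1,Q).
# They must satisfy i1 <= i1_2, which holds since X1 and X2 are independent given Q.
i1, i1_2, i2_1 = F(1, 2), F(1), F(3, 4)

grid = [F(k, 16) for k in range(0, 33)]            # rates 0, 1/16, ..., 2
mismatches = [(r1, r2) for r1 in grid for r2 in grid
              if in_mac_form(r1, r2, i1, i1_2, i2_1)
              != in_min_form(r1, r2, i1, i1_2, i2_1)]
```

For these values the list `mismatches` is empty, that is, the two descriptions agree on every grid point, including points where $R_2$ exceeds the saturation level $I(X_2; Y_1 | X_1, Q)$.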

Remark 3 (Nonconvexity). The random coding optimal rate region $\mathscr{R}^*(p)$ is not convex in general. This is exemplified
by the deterministic 2-DM-IC in Figure 4.

Figure 4. An example for nonconvex $\mathscr{R}^*(p)$: (a) channel block diagram; (b) regions $\mathscr{R}_1(p)$, $\mathscr{R}_2(p)$, and $\mathscr{R}^*(p)$ for
$Q = \emptyset$ and $X_1, X_2 \sim \mathrm{Unif}[0 : 3]$.

A direct approach to proving Theorem 1 would be to analyze the performance of maximum likelihood decoding:

$$\hat{m}_1 = \arg\max_{m_1} \frac{1}{2^{nR_2}} \sum_{m_2} \prod_{i=1}^n p_{Y_1|X_1,X_2}(y_{1i} | x_{1i}(m_1), x_{2i}(m_2)),$$
$$\hat{m}_2 = \arg\max_{m_2} \frac{1}{2^{nR_1}} \sum_{m_1} \prod_{i=1}^n p_{Y_2|X_1,X_2}(y_{2i} | x_{1i}(m_1), x_{2i}(m_2))$$

for the $p$-distributed random code. Instead of performing this analysis, which is quite complicated (if possible at all),
we establish the achievability of $\mathscr{R}^*(p)$ by the suboptimal simultaneous nonunique decoding rule, which uses the
notion of joint typicality. We then show that if the average probability of error of the $(2^{nR_1}, 2^{nR_2}, n; p)$ random
code ensemble tends to zero as $n \to \infty$, then the rate pair $(R_1, R_2)$ must lie in $\mathscr{R}^*(p)$.
A. Proof of Achievability

Each receiver uses simultaneous nonunique decoding. Receiver 1 declares that $\hat{m}_1$ is sent if it is the unique
message among $[1 : 2^{nR_1}]$ such that

$$(q^n, x_1^n(\hat{m}_1), x_2^n(m_2), y_1^n) \in \mathcal{T}_\epsilon^{(n)} \text{ for some } m_2 \in [1 : 2^{nR_2}].$$

If there is no such message or more than one, it declares an error. Similarly, receiver 2 finds the unique message
$\hat{m}_2 \in [1 : 2^{nR_2}]$ such that

$$(q^n, x_1^n(m_1), x_2^n(\hat{m}_2), y_2^n) \in \mathcal{T}_\epsilon^{(n)} \text{ for some } m_1 \in [1 : 2^{nR_1}].$$

To analyze the probability of decoding error averaged over the random codebook ensemble, assume without loss of
generality that $(M_1, M_2) = (1, 1)$ is sent. Receiver 1 makes an error only if one or both of the following events
occur:

$$\mathcal{E}_1 = \{(Q^n, X_1^n(1), X_2^n(1), Y_1^n) \notin \mathcal{T}_\epsilon^{(n)}\},$$
$$\mathcal{E}_2 = \{(Q^n, X_1^n(m_1), X_2^n(m_2), Y_1^n) \in \mathcal{T}_\epsilon^{(n)} \text{ for some } m_1 \ne 1 \text{ and some } m_2\}.$$

By the law of large numbers, $P(\mathcal{E}_1)$ tends to zero as $n \to \infty$.

We bound $P(\mathcal{E}_2)$ in two ways, which leads to the MAC form of $\mathscr{R}_1(p)$ in Remark 1. First, since the joint
typicality of the quadruple $(Q^n, X_1^n(m_1), X_2^n(m_2), Y_1^n)$ for each $m_2$ implies the joint typicality of the triple
$(Q^n, X_1^n(m_1), Y_1^n)$, we have

$$\mathcal{E}_2 \subseteq \{(Q^n, X_1^n(m_1), Y_1^n) \in \mathcal{T}_\epsilon^{(n)} \text{ for some } m_1 \ne 1\} = \mathcal{E}_2'.$$

Then, by the packing lemma in [10, Section 3.2], $P(\mathcal{E}_2')$, and hence $P(\mathcal{E}_2)$, tends to zero as $n \to \infty$ if

$$R_1 < I(X_1; Y_1 | Q) - \delta(\epsilon). \tag{9}$$

The second way to bound $P(\mathcal{E}_2)$ is to partition $\mathcal{E}_2$ into the two events

$$\mathcal{E}_{21} = \{(Q^n, X_1^n(m_1), X_2^n(1), Y_1^n) \in \mathcal{T}_\epsilon^{(n)} \text{ for some } m_1 \ne 1\},$$
$$\mathcal{E}_{22} = \{(Q^n, X_1^n(m_1), X_2^n(m_2), Y_1^n) \in \mathcal{T}_\epsilon^{(n)} \text{ for some } m_1 \ne 1, m_2 \ne 1\}.$$

Again by the packing lemma, $P(\mathcal{E}_{21})$ and $P(\mathcal{E}_{22})$ tend to zero as $n \to \infty$ if

$$R_1 < I(X_1; Y_1 | X_2, Q) - \delta(\epsilon), \tag{10a}$$
$$R_1 + R_2 < I(X_1, X_2; Y_1 | Q) - \delta(\epsilon). \tag{10b}$$

Thus we have shown that the average probability of decoding error at receiver 1 tends to zero as $n \to \infty$ if at least one
of (9) and (10) holds. Similarly, we can show that the average probability of decoding error at receiver 2 tends to zero
as $n \to \infty$ if $R_2 < I(X_2; Y_2 | Q) - \delta(\epsilon)$, or $R_2 < I(X_2; Y_2 | X_1, Q) - \delta(\epsilon)$ and $R_1 + R_2 < I(X_1, X_2; Y_2 | Q) - \delta(\epsilon)$.
Since $\epsilon > 0$ is arbitrary and $\delta(\epsilon) \to 0$ as $\epsilon \to 0$, this completes the proof of achievability for any rate pair $(R_1, R_2)$
in the interior of $\mathscr{R}_1(p) \cap \mathscr{R}_2(p)$. □






Remark 4 (Comparison to maximum likelihood decoding). It is instructive to consider the following progression of
decoding rules for receiver 1.

1) Maximum likelihood decoding:

$$\hat{m}_1 = \arg\max_{m_1} p(y_1^n | m_1)$$
$$= \arg\max_{m_1} \frac{1}{2^{nR_2}} \sum_{m_2} p(y_1^n | m_1, m_2) \tag{11}$$
$$= \arg\max_{m_1} \frac{1}{2^{nR_2}} \sum_{m_2} \prod_{i=1}^n p_{Y_1|X_1,X_2}(y_{1i} | x_{1i}(m_1), x_{2i}(m_2)),$$

which is the optimal decoding rule.

2) Simultaneous maximum likelihood decoding:

$$\hat{m}_1 = \arg\max_{m_1} \max_{m_2} p(y_1^n | m_1, m_2),$$

which is equivalent to performing optimal decoding of the message pair $(M_1, M_2)$ and then taking the first
coordinate. Note the maximum over $m_2$ instead of the average as in (11).

3) Typicality score decoding:

$$\hat{m}_1 = \arg\min_{m_1} \min_{m_2} \epsilon^*(y_1^n, m_1, m_2),$$

where $\epsilon^*(y_1^n, m_1, m_2)$ is defined as the smallest $\epsilon$ such that

$$(q^n, x_1^n(m_1), x_2^n(m_2), y_1^n) \in \mathcal{T}_\epsilon^{(n)}.$$

Here the notion of joint typicality plays the role of likelihood in previous decoding rules and $\epsilon^*$ captures the
penalty for being atypical.

4) Simultaneous nonunique decoding: Find the unique $\hat{m}_1$ such that

$$(q^n, x_1^n(\hat{m}_1), x_2^n(m_2), y_1^n) \in \mathcal{T}_\epsilon^{(n)} \text{ for some } m_2.$$

This is equivalent to performing typicality score decoding with predetermined threshold $\epsilon$ for $\epsilon^*(y_1^n, m_1, m_2)$;
thus first forming a list of all $(m_1, m_2)$ for which $\epsilon^*(y_1^n, m_1, m_2) \le \epsilon$, and then taking the first coordinate of
the members of the list (if it is unique).

Starting from the optimal maximum likelihood decoding rule, each subsequent rule modifies its predecessor by
"relaxing" one step. Nonetheless, these relaxation steps do not result in any significant loss in performance, as is
evident in the rate-optimality of the simultaneous nonunique decoding rule.
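The difference between the first two rules (averaging versus maximizing over the interfering message) can be sketched in Python for a small example. The channel model ($Y_1 = X_1 \oplus X_2 \oplus Z$ with $Z \sim \mathrm{Bern}(p)$), the hand-picked codebooks, and the function names are assumptions made for illustration; the normalization by the number of interfering codewords is dropped because it does not affect the maximizer.

```python
from math import prod

def likelihood(y, x1, x2, p_flip):
    """p(y^n | x1^n, x2^n) for the toy channel Y = X1 xor X2 xor Z, Z ~ Bern(p_flip)."""
    return prod(p_flip if yi != (a ^ b) else 1 - p_flip
                for yi, a, b in zip(y, x1, x2))

def ml_decode(y, book1, book2, p_flip):
    """Rule 1: sum (equivalently, average) the likelihood over the interfering message."""
    score = lambda m1: sum(likelihood(y, book1[m1], c2, p_flip) for c2 in book2)
    return max(range(len(book1)), key=score)

def simultaneous_ml_decode(y, book1, book2, p_flip):
    """Rule 2: maximize over the interfering message instead of averaging."""
    score = lambda m1: max(likelihood(y, book1[m1], c2, p_flip) for c2 in book2)
    return max(range(len(book1)), key=score)

# Hypothetical codebooks with two codewords each (0-based message indices).
book1 = [(0, 0, 0, 0, 0, 0), (1, 1, 1, 0, 0, 0)]
book2 = [(0, 0, 0, 0, 0, 0), (0, 0, 0, 1, 1, 1)]
y = (1, 1, 1, 1, 1, 1)            # noiseless output for the message pair (1, 1)
decoded = (ml_decode(y, book1, book2, 0.1),
           simultaneous_ml_decode(y, book1, book2, 0.1))
```

Both functions search the codebooks exhaustively, so the sketch is only practical for very small examples; it is meant to show where the sum in rule 1 is replaced by a maximum in rule 2.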

Remark 5. As observed in [5] (see also (7) in Remark 1 above), each rate point in $\mathscr{R}^*(p)$ can alternatively be
achieved by having each receiver specifically decode for either the desired message alone or both the desired and 
interfering messages. 
B. Proof of the Converse

Fix a pmf $p = p(q) p(x_1|q) p(x_2|q)$ and let $(R_1, R_2)$ be a rate pair achievable by the $p$-distributed random code
ensemble. We prove that this implies that $(R_1, R_2) \in \mathscr{R}_1(p) \cap \mathscr{R}_2(p)$ as claimed. Here, we show the details for the
inclusion $(R_1, R_2) \in \mathscr{R}_1(p)$; the proof for $(R_1, R_2) \in \mathscr{R}_2(p)$ follows similarly. With slight abuse of notation, let $\mathcal{C}_n$ denote
the random codebook (and the time sharing sequence), namely $(Q^n, X_1^n(1), \ldots, X_1^n(2^{nR_1}), X_2^n(1), \ldots, X_2^n(2^{nR_2}))$.

First consider a fixed codebook $\mathcal{C}_n = c$. By Fano's inequality,

$$H(M_1 | Y_1^n, \mathcal{C}_n = c) \le 1 + nR_1 P_e^{(n)}(c).$$

Taking the expectation over the random codebook $\mathcal{C}_n$, it follows that

$$H(M_1 | Y_1^n, \mathcal{C}_n) \le 1 + nR_1 \mathsf{E}_{\mathcal{C}_n}[P_e^{(n)}(\mathcal{C}_n)] \le n\epsilon_n, \tag{12}$$

where $\epsilon_n \to 0$ as $n \to \infty$ since $\mathsf{E}_{\mathcal{C}_n}[P_e^{(n)}(\mathcal{C}_n)] \to 0$.

We prove the conditions in the min form (8). To see that the first inequality is true, note that

$$\begin{aligned}
n(R_1 - \epsilon_n) &= H(M_1 | \mathcal{C}_n) - n\epsilon_n \\
&\overset{(a)}{\le} I(M_1; Y_1^n | \mathcal{C}_n) \\
&\le I(X_1^n; Y_1^n | \mathcal{C}_n) \\
&\le I(X_1^n; Y_1^n, X_2^n | \mathcal{C}_n) \\
&= I(X_1^n; Y_1^n | X_2^n, \mathcal{C}_n) \\
&= H(Y_1^n | X_2^n, \mathcal{C}_n) - H(Y_1^n | X_1^n, X_2^n, \mathcal{C}_n) \\
&\overset{(b)}{\le} H(Y_1^n | X_2^n, Q^n) - H(Y_1^n | X_1^n, X_2^n, Q^n) \\
&\overset{(c)}{=} n I(X_1; Y_1 | X_2, Q),
\end{aligned}$$

where (a) follows by (the averaged version of) Fano's inequality in (12), (b) follows by omitting some conditioning
and using the memoryless property of the channel, and (c) follows since the tuple $(Q_i, X_{1i}, X_{2i}, Y_{1i})$ is i.i.d. for
all $i$. Note that unlike conventional converse proofs where nothing can be assumed about the codebook structure,
here we can take advantage of the properties of a given codebook generation procedure.

To prove the second inequality in (8), we need the following lemma, which is proved in Appendix A.

Lemma 1.

$$\lim_{n \to \infty} \frac{1}{n} H(Y_1^n | X_1^n, \mathcal{C}_n) = H(Y_1 | X_1, X_2, Q) + \min\{R_2, I(X_2; Y_1 | X_1, Q)\}.$$

The lemma states that depending on $R_2$, $(1/n) H(Y_1^n | X_1^n, \mathcal{C}_n)$ either tends to $H(Y_1 | X_1, Q)$, that is, the remaining
received sequence after recovering the desired codeword looks like i.i.d. noise, or to $R_2 + H(Y_1 | X_1, X_2, Q)$, that
is, the receiver can distinguish the interfering codeword from the noise.
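The two regimes of Lemma 1 can be illustrated numerically. The sketch below evaluates the right-hand side of the lemma for a hypothetical binary channel $Y_1 = X_1 \oplus X_2 \oplus Z$ with $Z \sim \mathrm{Bern}(0.11)$, uniform $X_2$, and no time sharing; the channel and the function names are assumptions, not taken from the paper.

```python
from math import log2

def binary_entropy(p):
    """h(p) in bits, with h(0) = h(1) = 0."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def lemma1_limit(r2, h_noise, i_interf):
    """Right-hand side of Lemma 1: H(Y1|X1,X2,Q) + min{R2, I(X2;Y1|X1,Q)}."""
    return h_noise + min(r2, i_interf)

# Hypothetical channel: Y1 = X1 xor X2 xor Z, Z ~ Bern(0.11), X2 uniform, Q constant.
h_noise = binary_entropy(0.11)     # H(Y1 | X1, X2) = h(0.11)
i_interf = 1.0 - h_noise           # I(X2; Y1 | X1) = H(Y1 | X1) - H(Y1 | X1, X2)

low_rate = lemma1_limit(0.2, h_noise, i_interf)    # interferer distinguishable
high_rate = lemma1_limit(0.9, h_noise, i_interf)   # saturated: equals H(Y1 | X1) = 1
```

For the low interfering rate the limit equals $R_2 + H(Y_1 | X_1, X_2)$, while for the high rate it saturates at $H(Y_1 | X_1) = 1$ bit, matching the two cases described after the lemma.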
Equipped with this lemma, we have

$$\begin{aligned}
n(R_1 - \epsilon_n) &\overset{(a)}{\le} I(X_1^n; Y_1^n | \mathcal{C}_n) \\
&= H(Y_1^n | \mathcal{C}_n) - H(Y_1^n | X_1^n, \mathcal{C}_n) \\
&\le H(Y_1^n | Q^n) - H(Y_1^n | X_1^n, \mathcal{C}_n) \\
&\overset{(b)}{\le} nH(Y_1 | Q) - nH(Y_1 | X_1, X_2, Q) - \min\{nR_2, nI(X_2; Y_1 | X_1, Q)\} + n\epsilon_n \\
&= nI(X_1, X_2; Y_1 | Q) - \min\{nR_2, nI(X_2; Y_1 | X_1, Q)\} + n\epsilon_n.
\end{aligned}$$

Here, (a) follows by Fano's inequality and (b) follows by Lemma 1 with some $\epsilon_n$ that tends to zero as $n \to \infty$.
The conditions for $\mathscr{R}_2(p)$ can be proved similarly. This completes the proof of the converse.

III. DM-IN with K Senders and L Receivers

We generalize the previous result to the $K$-sender, $L$-receiver discrete memoryless interference network
($(K, L)$-DM-IN) with input alphabets $\mathcal{X}_1, \ldots, \mathcal{X}_K$, output alphabets $\mathcal{Y}_1, \ldots, \mathcal{Y}_L$, and pmfs $p(y_1, \ldots, y_L | x_1, \ldots, x_K)$. In
this network, each sender $k \in [1 : K]$ communicates an independent message $M_k$ at rate $R_k$ and each receiver
$l \in [1 : L]$ wishes to recover the messages sent by a subset $\mathcal{D}_l \subseteq [1 : K]$ of senders (also referred to as a demand
set). The channel is depicted in Figure 5.

Figure 5. Discrete memoryless interference network with K senders and L receivers.

More formally, a $(2^{nR_1}, \ldots, 2^{nR_K}, n)$ code $\mathcal{C}_n$ for the $(K, L)$-DM-IN consists of

• $K$ message sets $[1 : 2^{nR_1}], \ldots, [1 : 2^{nR_K}]$,

• $K$ encoders, where encoder $k \in [1 : K]$ assigns a codeword $x_k^n(m_k)$ to each message $m_k \in [1 : 2^{nR_k}]$,

• $L$ decoders, where decoder $l \in [1 : L]$ assigns estimates $\hat{m}_{kl}$, $k \in \mathcal{D}_l$, or an error message e to each received
sequence $y_l^n$.

We assume that the message tuple $(M_1, \ldots, M_K)$ is uniformly distributed over $[1 : 2^{nR_1}] \times \cdots \times [1 : 2^{nR_K}]$. The
average probability of error for the code $\mathcal{C}_n$ is defined as

$$P_e^{(n)}(\mathcal{C}_n) = P\{\hat{M}_{kl} \ne M_k \text{ for some } l \in [1 : L], k \in \mathcal{D}_l\}.$$

A rate tuple $(R_1, \ldots, R_K)$ is said to be achievable for the DM-IN if there exists a sequence of $(2^{nR_1}, \ldots, 2^{nR_K}, n)$
codes $\mathcal{C}_n$ such that $\lim_{n \to \infty} P_e^{(n)}(\mathcal{C}_n) = 0$. The capacity region $\mathscr{C}$ of the $(K, L)$-DM-IN is the closure of the set of
achievable rate tuples $(R_1, \ldots, R_K)$.



As in Section II, we limit our attention to a randomly generated code ensemble with a special structure. Let
$p = p(q) p(x_1|q) \cdots p(x_K|q)$ be a given pmf on $\mathcal{Q} \times \mathcal{X}_1 \times \cdots \times \mathcal{X}_K$, where $\mathcal{Q}$ is a finite alphabet. Suppose that
codewords $X_k^n(m_k)$, $m_k \in [1 : 2^{nR_k}]$, $k \in [1 : K]$, are generated randomly as follows.

• Let $Q^n \sim \prod_{i=1}^n p_Q(q_i)$.

• For each $k \in [1 : K]$ and $m_k \in [1 : 2^{nR_k}]$, let $X_k^n(m_k) \sim \prod_{i=1}^n p_{X_k|Q}(x_{ki}|q_i)$, conditionally independent given
$Q^n$.

Each instance of codebooks generated in this manner, along with the corresponding optimal decoders, constitutes
a $(2^{nR_1}, \ldots, 2^{nR_K}, n)$ code. We refer to the random code ensemble thus generated as the $(2^{nR_1}, \ldots, 2^{nR_K}, n; p)$
random code ensemble.

Definition 2 (Random coding optimal rate region). Given a pmf $p = p(q) p(x_1|q) \cdots p(x_K|q)$, the optimal rate region
$\mathscr{R}^*(p)$ achievable by the $p$-distributed random code ensemble is the closure of the set of rate tuples $(R_1, \ldots, R_K)$
such that the sequence of the $(2^{nR_1}, \ldots, 2^{nR_K}, n; p)$ random code ensembles $\mathcal{C}_n$ satisfies

$$\lim_{n \to \infty} \mathsf{E}_{\mathcal{C}_n}[P_e^{(n)}(\mathcal{C}_n)] = 0,$$

where the expectation is with respect to the random code ensemble $\mathcal{C}_n$.

Note that the setup discussed in Section II as well as the broadcast channel example in Subsection I-B correspond to
the special case of $K = L = 2$ and demand sets $\mathcal{D}_1 = \{1\}$ and $\mathcal{D}_2 = \{2\}$. More generally, the $p$-distributed random
code ensemble for the $(K, L)$-DM-IN captures superposition coding with an arbitrary number of layers. Suppose
that there are $K$ senders, some of which need to communicate multiple messages (see Figure 6(a)). In superposition
coding, each message at a sender is encoded into a codeword $U_{k'}^n$, and the sender combines (superimposes) all such
codewords. By merging the combining functions at the sender with the physical channel $p(y^L | x^K)$, we obtain a
$(K', L)$-DM-IN $p(y^L | u^{K'})$ with "virtual" inputs $U_{k'}$, $k' \in [1 : K']$, as illustrated in Figure 6(b).

Figure 6. The class of $(K, L)$-DM-INs includes superposition coding with an arbitrary number of layers: (a) multiple messages
per sender via superposition coding; (b) equivalent channel with a single message per sender.






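Define the rate region $\mathscr{R}_l(p)$ as

$$\mathscr{R}_l(p) = \bigcup_{\mathcal{S} \subseteq [1:K],\ \mathcal{D}_l \subseteq \mathcal{S}} \mathscr{R}_{\mathrm{MAC}(\mathcal{S})}(p), \tag{13}$$

where $\mathscr{R}_{\mathrm{MAC}(\mathcal{S})}(p)$ is the achievable rate region for the multiple access channel from the set of senders $\mathcal{S}$ to
receiver $l$, i.e., the set of rate tuples $(R_1, \ldots, R_K)$ such that

$$R_\mathcal{T} = \sum_{j \in \mathcal{T}} R_j < I(X_\mathcal{T}; Y_l | X_{\mathcal{S} \setminus \mathcal{T}}, Q) \text{ for all } \mathcal{T} \subseteq \mathcal{S}.$$

Note that the set $\mathscr{R}_{\mathrm{MAC}(\mathcal{S})}(p)$ corresponds to the rate region achievable by decoding for the messages from the
senders $\mathcal{S}$, which contains all desired messages and possibly some interfering messages. Also note that $\mathscr{R}_{\mathrm{MAC}(\mathcal{S})}(p)$
contains upper bounds only on the rates $R_k$, $k \in \mathcal{S}$, of the active senders $\mathcal{S}$ in the MAC. The signals from the inactive
senders in $\mathcal{S}^c$ are treated as noise and the corresponding rates $R_k$ for $k \in \mathcal{S}^c$ are unconstrained. Consequently,
$\mathscr{R}_l(p)$ is unbounded in the coordinates $R_k$ for $k \in [1 : K] \setminus \mathcal{D}_l$.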

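The region $\mathscr{R}_1(p)$ in (13) can equivalently be written as the set of rate tuples $(R_1, \ldots, R_K)$ such that for all
$\mathcal{U} \subseteq [1:K] \setminus \mathcal{D}_1$ and for all $\mathcal{D}$ with $\emptyset \ne \mathcal{D} \subseteq \mathcal{D}_1$,

$$R_{\mathcal{D}} + \min_{\mathcal{U}' \subseteq \mathcal{U}} \bigl\{ R_{\mathcal{U}'} + I(X_{\mathcal{U} \setminus \mathcal{U}'}; Y_1 \mid X_{\mathcal{D}}, X_{\mathcal{U}'}, X_{[1:K] \setminus \mathcal{D} \setminus \mathcal{U}}, Q) \bigr\} \le I(X_{\mathcal{D}}, X_{\mathcal{U}}; Y_1 \mid X_{[1:K] \setminus \mathcal{D} \setminus \mathcal{U}}, Q). \tag{14}$$

As in the case of the 2-DM-IC, each argument of each term in the minimum represents a different mode of signal
saturation. The equivalence between the MAC form (13) and the min form (14) can be proved by identifying the
largest set of decodable interfering messages as in [17]. For completeness, we provide a proof in Appendix B.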
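
The equivalence can also be checked numerically on a toy example. The following Python sketch is not part of the paper. It assumes a Gaussian receiver $Y_1 = \sum_k g_k X_k + Z$ with independent unit-power Gaussian inputs, unit noise variance, and no time sharing, so that every mutual information term reduces to a Gaussian capacity expression. The function names (`mutual_info`, `in_mac_form`, `in_min_form`) are hypothetical. Membership is tested in the quantifier versions of the two forms that Remark 6 labels (15) and (16).

```python
from itertools import combinations
from math import log2
import random


def subsets(items, nonempty=False):
    """Return all subsets of `items` as tuples, optionally skipping the empty set."""
    items = list(items)
    start = 1 if nonempty else 0
    return [c for r in range(start, len(items) + 1) for c in combinations(items, r)]


def mutual_info(gains, A, B):
    """I(X_A; Y_1 | X_B) in bits under the Gaussian assumption stated above.

    Senders outside A and B contribute to the effective noise.
    """
    signal = sum(gains[k] ** 2 for k in A)
    noise = 1.0 + sum(g ** 2 for k, g in enumerate(gains) if k not in A and k not in B)
    return 0.5 * log2(1.0 + signal / noise)


def rate_sum(rates, T):
    return sum(rates[k] for k in T)


def in_mac_form(rates, gains, desired):
    """MAC form: some S containing the desired set satisfies every MAC constraint."""
    others = [k for k in range(len(rates)) if k not in desired]
    for extra in subsets(others):
        S = set(desired) | set(extra)
        if all(rate_sum(rates, T) <= mutual_info(gains, set(T), S - set(T))
               for T in subsets(sorted(S), nonempty=True)):
            return True
    return False


def in_min_form(rates, gains, desired):
    """Min form: for every V = D ∪ U there is a feasible subset V' = D ∪ U'."""
    everyone = set(range(len(rates)))
    others = sorted(everyone - set(desired))
    for D in subsets(sorted(desired), nonempty=True):
        for U in subsets(others):
            outside = everyone - set(D) - set(U)
            if not any(rate_sum(rates, D + Up) <= mutual_info(gains, set(D + Up), outside)
                       for Up in subsets(U)):
                return False
    return True


# Compare the two membership tests on random rate tuples (sender 0 is desired).
rng = random.Random(0)
gains = [1.0, 0.8, 1.5, 0.3]
agree = all(
    in_mac_form(r, gains, {0}) == in_min_form(r, gains, {0})
    for r in ([rng.uniform(0.0, 0.8) for _ in gains] for _ in range(200))
)
```

Both tests enumerate subsets exhaustively, so the sketch is only practical for a handful of senders; it is meant as a sanity check of the two descriptions, not as an efficient algorithm.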

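Remark 6. The MAC and min forms of $\mathscr{R}_1(p)$ are duals to each other in the following sense. The condition for
$(R_1, \ldots, R_K) \in \mathscr{R}_1(p)$ in the MAC form (13) can be expressed as

$$\exists\, \mathcal{S} \subseteq [1:K],\ \mathcal{D}_1 \subseteq \mathcal{S}: \quad \forall\, \mathcal{T} \subseteq \mathcal{S}: \quad R_{\mathcal{T}} \le I(X_{\mathcal{T}}; Y_1 \mid X_{\mathcal{S} \setminus \mathcal{T}}, Q). \tag{15}$$

The conditions in the min form (14) can be rewritten$^1$ as

$$\forall\, \mathcal{V} \subseteq [1:K],\ \mathcal{V} \cap \mathcal{D}_1 \ne \emptyset: \quad \exists\, \mathcal{V}' \subseteq \mathcal{V},\ \mathcal{V}' \cap \mathcal{D}_1 = \mathcal{V} \cap \mathcal{D}_1: \quad R_{\mathcal{V}'} \le I(X_{\mathcal{V}'}; Y_1 \mid X_{[1:K] \setminus \mathcal{V}}, Q). \tag{16}$$

Both conditions involve a set of messages from the senders $\mathcal{S}$ (or $\mathcal{V}$) and its subset $\mathcal{T}$ (or $\mathcal{V}'$), and impose a mutual
information upper bound on the sum rate over the subset. The key difference is the order of the quantifiers $\forall$ and $\exists$.

$^1$To see this, first note that the minimum terms on the left hand side of (14) represent a set of conditions of which at least one has to be true,
then use the identity

$$I(X_{\mathcal{D}}, X_{\mathcal{U}}; Y_1 \mid X_{[1:K] \setminus \mathcal{D} \setminus \mathcal{U}}, Q) - I(X_{\mathcal{U} \setminus \mathcal{U}'}; Y_1 \mid X_{\mathcal{D}}, X_{\mathcal{U}'}, X_{[1:K] \setminus \mathcal{D} \setminus \mathcal{U}}, Q) = I(X_{\mathcal{D}}, X_{\mathcal{U}'}; Y_1 \mid X_{[1:K] \setminus \mathcal{D} \setminus \mathcal{U}}, Q),$$

and finally, let $\mathcal{V} = \mathcal{D} \cup \mathcal{U}$ and $\mathcal{V}' = \mathcal{D} \cup \mathcal{U}'$.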

Analogous to $\mathscr{R}_1(p)$, define the regions $\mathscr{R}_2(p), \ldots, \mathscr{R}_L(p)$ for receivers $2, \ldots, L$ by making appropriate index
substitutions. We are now ready to state the main result for the $(K, L)$-DM-IN.

Theorem 2. Given a pmf $p = p(q)\,p(x_1|q) \cdots p(x_K|q)$, the optimal rate region of the $(K, L)$-DM-IN $p(y^L|x^K)$
with demand sets $\mathcal{D}_1, \ldots, \mathcal{D}_L$ achievable by the $p$-distributed random code ensemble is

$$\mathscr{R}^*(p) = \bigcap_{l=1}^{L} \mathscr{R}_l(p).$$

Note that, as for its 2-DM-IC counterpart, this region is not convex in general.

Example 1. Consider the $K$-user-pair Gaussian interference network

$$Y_l = \sum_{k=1}^{K} g_{kl} X_k + Z_l, \quad l \in [1:K],$$

where $Z_l \sim \mathrm{N}(0, 1)$ and $g_{kl}$ are channel gains from sender $k$ to receiver $l$. Assume the Gaussian random code
ensemble with $X_k \sim \mathrm{N}(0, 1)$, $k \in [1 : K]$. The optimal rate region achievable by this random code ensemble was
established in [17] and [2], and can be recovered from Theorem 2 by letting $K = L$, $\mathcal{D}_k = \{k\}$ for $k \in [1:K]$, and
applying the discretization procedure in [10, Section 3.4]. Theorem 2 generalizes this result in several directions,
since (a) it applies to non-Gaussian networks, (b) it applies to non-Gaussian random code ensembles (which is crucial
to analyze the performance under a fixed constellation), and (c) it includes coded time sharing and superposition
coding.
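
To make the region of Theorem 2 concrete for this example, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' code: it takes $\mathcal{D}_l = \{l\}$, unit-power Gaussian inputs, unit noise variance, and no time sharing, and it evaluates each per-receiver region in its MAC form (13). The names `capacity`, `in_receiver_region`, and `in_optimal_region`, and the numerical gains and rate points, are hypothetical choices made for the illustration.

```python
from itertools import combinations
from math import log2, sqrt


def capacity(x):
    """Gaussian capacity function C(x) = (1/2) log2(1 + x), in bits per channel use."""
    return 0.5 * log2(1.0 + x)


def in_receiver_region(rates, gains_to_rx, desired):
    """MAC form at one receiver: some sender set S that includes `desired` is decodable."""
    K = len(rates)
    others = [k for k in range(K) if k != desired]
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            S = (desired,) + extra
            # Senders outside S are treated as additional Gaussian noise.
            noise = 1.0 + sum(gains_to_rx[k] ** 2 for k in range(K) if k not in S)
            if all(sum(rates[k] for k in T)
                   <= capacity(sum(gains_to_rx[k] ** 2 for k in T) / noise)
                   for t in range(1, len(S) + 1) for T in combinations(S, t)):
                return True
    return False


def in_optimal_region(rates, gains):
    """Intersection over receivers, with gains[k][l] the gain from sender k to receiver l."""
    K = len(rates)
    return all(in_receiver_region(rates, [gains[k][l] for k in range(K)], l)
               for l in range(K))


# Two user pairs with unit direct gains and weak cross gains (squared gain 0.1).
weak = sqrt(0.1)
gains = [[1.0, weak], [weak, 1.0]]
point_a = (0.46, 0.46)  # both receivers treat the interference as noise
point_b = (0.49, 0.03)  # receiver 1 additionally decodes the low-rate interferer
midpoint = tuple((a + b) / 2 for a, b in zip(point_a, point_b))
```

With these illustrative numbers, `point_a` and `point_b` pass the membership test while their midpoint does not, which is one way to see the non-convexity noted after Theorem 2 when time sharing is switched off.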

Example 2. Consider the deterministic interference channel with three sender-receiver pairs (3-DIC) [3], where

$$Y_1 = f_1\bigl(g_{11}(X_1), h_1(g_{21}(X_2), g_{31}(X_3))\bigr),$$
$$Y_2 = f_2\bigl(g_{22}(X_2), h_2(g_{32}(X_3), g_{12}(X_1))\bigr),$$
$$Y_3 = f_3\bigl(g_{33}(X_3), h_3(g_{13}(X_1), g_{23}(X_2))\bigr)$$

for some loss functions $g_{kl}$ and combining functions $h_k$ and $f_k$, $k, l \in \{1, 2, 3\}$. The combining functions are
supposed to be injective in each argument. This setting is of interest since it contains as special cases the El Gamal-Costa
two-user-pair interference channel [11], for which the Han-Kobayashi coding scheme achieves the capacity
region, and the Avestimehr-Diggavi-Tse q-ary expansion deterministic (QED) interference channel [1], which
approximates Gaussian interference networks in the high-power regime. The 3-DIC is an instance of a $(K, L)$-DM-IN
with $L = K = 3$ and $\mathcal{D}_k = \{k\}$ for $k \in [1 : K]$. The interference decoding inner bound on the 3-DIC capacity region
in [3] coincides with the region in Theorem 2 in its min form. Beyond the results in [3], Theorem 2 establishes
that the interference decoding inner bound is in fact optimal given the codebook structure. Note that for the 3-DIC,
we can identify each minimum term with a specific signal in the channel block diagram for which the term
counts the number of distinguishable sequences.

Proof of Theorem 2: We focus only on receiver 1, for which $M_k$, $k \in \mathcal{D}_1$, are the desired messages and $M_k$,
$k \in \mathcal{D}_1^c = [1 : K] \setminus \mathcal{D}_1$, are interfering messages. Achievability is proved using simultaneous nonunique decoding.
Receiver 1 declares that $\hat{m}_{\mathcal{D}_1}$ is sent if it is the unique message tuple such that

$$\bigl(q^n, x^n_{\mathcal{D}_1}(\hat{m}_{\mathcal{D}_1}), x^n_{\mathcal{D}_1^c}(m_{\mathcal{D}_1^c}), y_1^n\bigr) \in \mathcal{T}_\epsilon^{(n)} \quad \text{for some } m_{\mathcal{D}_1^c},$$

where $x^n_{\mathcal{D}_1}(m_{\mathcal{D}_1})$ is the tuple of $x_k^n(m_k)$, $k \in \mathcal{D}_1$, and similarly, $x^n_{\mathcal{D}_1^c}(m_{\mathcal{D}_1^c})$ is the tuple of $x_k^n(m_k)$, $k \in \mathcal{D}_1^c$. The
analysis follows similar steps as in Subsection II-A.

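To prove the converse, fix a pmf $p$ and let $(R_1, \ldots, R_K)$ be a rate tuple that is achievable by the $p$-distributed
random code ensemble. We need the following generalization of Lemma 1, which is proved in Appendix C.

Lemma 2. If $\mathcal{D}_1 \subseteq \mathcal{S} \subseteq [1 : K]$, then

$$\lim_{n \to \infty} \frac{1}{n} H(Y_1^n \mid X_{\mathcal{S}}^n, \mathcal{C}_n) = H(Y_1 \mid X_{[1:K]}, Q) + \min_{\mathcal{U} \subseteq \mathcal{S}^c} \bigl( R_{\mathcal{U}} + I(X_{(\mathcal{S} \cup \mathcal{U})^c}; Y_1 \mid X_{\mathcal{S} \cup \mathcal{U}}, Q) \bigr).$$

We now establish (14) as follows. Fix a subset of desired message indices, $\mathcal{D} \subseteq \mathcal{D}_1$, and a subset of interfering
message indices, $\mathcal{U} \subseteq \mathcal{D}_1^c$. Then

$$\begin{aligned}
n(R_{\mathcal{D}} - \epsilon_n) &\overset{(a)}{\le} I(X_{\mathcal{D}}^n; Y_1^n \mid \mathcal{C}_n) \\
&\le I(X_{\mathcal{D}}^n; Y_1^n, X^n_{(\mathcal{D} \cup \mathcal{U})^c} \mid \mathcal{C}_n) \\
&\le I(X_{\mathcal{D}}^n; Y_1^n \mid X^n_{(\mathcal{D} \cup \mathcal{U})^c}, \mathcal{C}_n) \\
&= H(Y_1^n \mid X^n_{(\mathcal{D} \cup \mathcal{U})^c}, \mathcal{C}_n) - H(Y_1^n \mid X^n_{\mathcal{U}^c}, \mathcal{C}_n) \\
&\overset{(b)}{\le} n H(Y_1 \mid X_{(\mathcal{D} \cup \mathcal{U})^c}, Q) - n H(Y_1 \mid X_{[1:K]}, Q) - n \min_{\mathcal{U}' \subseteq \mathcal{U}} \bigl( R_{\mathcal{U}'} + I(X_{(\mathcal{U}^c \cup \mathcal{U}')^c}; Y_1 \mid X_{\mathcal{U}^c \cup \mathcal{U}'}, Q) \bigr) + n\epsilon_n \\
&= n I(X_{\mathcal{D} \cup \mathcal{U}}; Y_1 \mid X_{(\mathcal{D} \cup \mathcal{U})^c}, Q) - n \min_{\mathcal{U}' \subseteq \mathcal{U}} \bigl( R_{\mathcal{U}'} + I(X_{\mathcal{U} \setminus \mathcal{U}'}; Y_1 \mid X_{(\mathcal{U} \setminus \mathcal{U}')^c}, Q) \bigr) + n\epsilon_n,
\end{aligned}$$

where (a) follows by Fano's inequality and (b) follows by Lemma 2. This completes the proof of the converse. ■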

IV. Application to the Han-Kobayashi coding scheme 

We revisit the two-user-pair DM-IC in Figure 1. The best known inner bound on the capacity region is achieved
by the Han-Kobayashi coding scheme [13]. In this scheme, the message $M_1$ is split into common and private
messages $M_{12}$ and $M_{11}$ at rates $R_{12}$ and $R_{11}$, respectively, such that $R_1 = R_{12} + R_{11}$. Similarly, $M_2$ is split
into common and private messages $M_{21}$ and $M_{22}$ at rates $R_{21}$ and $R_{22}$ such that $R_2 = R_{22} + R_{21}$. More
specifically, the scheme uses random codebook generation and coded time sharing as follows. Fix a pmf $p =
p(q)\,p(u_{11}|q)\,p(u_{12}|q)\,p(u_{21}|q)\,p(u_{22}|q)\,p(x_1|u_{11}, u_{12}, q)\,p(x_2|u_{21}, u_{22}, q)$, where the latter two conditional pmfs
represent deterministic mappings $x_1(u_{11}, u_{12})$ and $x_2(u_{21}, u_{22})$. Randomly generate a coded time sharing sequence
$q^n \sim \prod_{i=1}^{n} p_Q(q_i)$. For each $k, k' \in \{1, 2\}$ and $m_{kk'} \in [1 : 2^{nR_{kk'}}]$, randomly and conditionally independently
generate a sequence $u^n_{kk'}(m_{kk'})$ according to $\prod_{i=1}^{n} p_{U_{kk'}|Q}(u_{kk'i}|q_i)$. To communicate message pair $(m_{11}, m_{12})$,
sender 1 transmits $x_{1i} = x_1(u_{11i}, u_{12i})$ for $i \in [1:n]$, and analogously for sender 2. Receiver $k = 1, 2$ recovers its
intended message $M_k$ and the common message from the other sender (although it is not required to). While this
decoding scheme helps reduce the effect of interference, it results in additional constraints on the rates for common
messages. The Han-Kobayashi coding scheme is illustrated in Figure 7.
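
The codebook generation just described can be sketched in a few lines of Python. This is a toy illustration rather than the paper's construction in full generality: it assumes binary alphabets for $Q$ and all $U_{kk'}$, and the function names (`generate_codebooks`, `encode`), the probabilities, the rates, and the XOR mapping standing in for $x_1(u_{11}, u_{12})$ are all hypothetical. Messages are indexed from 0 for convenience.

```python
import random


def generate_codebooks(n, rates, p_q, p_u_given_q, seed=0):
    """Draw one realization of the random code ensemble over binary alphabets.

    rates maps the layer labels "11", "12", "21", "22" to R_kk';
    p_q is P{Q = 1}; p_u_given_q[layer][q] is P{U_layer = 1 | Q = q}.
    """
    rng = random.Random(seed)
    # Coded time sharing sequence q^n, drawn i.i.d. according to p_Q.
    q = [int(rng.random() < p_q) for _ in range(n)]
    codebooks = {}
    for layer, rate in rates.items():
        num_messages = round(2 ** (n * rate))
        # Each codeword is drawn conditionally independently given q^n.
        codebooks[layer] = [
            [int(rng.random() < p_u_given_q[layer][q_i]) for q_i in q]
            for _ in range(num_messages)
        ]
    return q, codebooks


def encode(codebooks, sender, messages, mapping):
    """Apply the symbol-by-symbol deterministic mapping x_ki = x_k(u_k1i, u_k2i)."""
    u_first = codebooks[f"{sender}1"][messages[0]]
    u_second = codebooks[f"{sender}2"][messages[1]]
    return [mapping(a, b) for a, b in zip(u_first, u_second)]


# Illustrative parameters only.
n = 8
rates = {"11": 0.25, "12": 0.25, "21": 0.5, "22": 0.25}
p_u_given_q = {layer: [0.3, 0.7] for layer in rates}
q, codebooks = generate_codebooks(n, rates, p_q=0.5, p_u_given_q=p_u_given_q)
x1 = encode(codebooks, sender=1, messages=(0, 3), mapping=lambda a, b: a ^ b)
```

The sketch covers only the encoding side; the decoders discussed in the text operate on the received sequences and are not modeled here.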



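[Figure 7: block diagram. The messages $M_{11}$, $M_{12}$ are mapped to $U_{11}^n$, $U_{12}^n$ and the messages $M_{21}$, $M_{22}$ to $U_{21}^n$, $U_{22}^n$; receiver 1 outputs estimates of $M_{11}, M_{12}, M_{21}$ from $Y_1^n$, and receiver 2 outputs estimates of $M_{21}, M_{22}, M_{12}$.]

Figure 7. Han-Kobayashi coding scheme.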



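Let $\mathscr{R}_{\mathrm{HK},1}(p)$ be defined as the set of rate tuples $(R_{11}, R_{12}, R_{21}, R_{22})$ such that

$$\begin{aligned}
R_{11} &\le I(U_{11}; Y_1 \mid U_{12}, U_{21}, Q), && \text{(17a)} \\
R_{12} &\le I(U_{12}; Y_1 \mid U_{11}, U_{21}, Q), && \text{(17b)} \\
R_{21} &\le I(U_{21}; Y_1 \mid U_{11}, U_{12}, Q), && \text{(17c)} \\
R_{11} + R_{12} &\le I(U_{11}, U_{12}; Y_1 \mid U_{21}, Q), && \text{(17d)} \\
R_{11} + R_{21} &\le I(U_{11}, U_{21}; Y_1 \mid U_{12}, Q), && \text{(17e)} \\
R_{12} + R_{21} &\le I(U_{12}, U_{21}; Y_1 \mid U_{11}, Q), && \text{(17f)} \\
R_{11} + R_{12} + R_{21} &\le I(U_{11}, U_{12}, U_{21}; Y_1 \mid Q). && \text{(17g)}
\end{aligned}$$

Similarly, define $\mathscr{R}_{\mathrm{HK},2}(p)$ by making the sender/receiver index substitutions $1 \leftrightarrow 2$ in the definition of $\mathscr{R}_{\mathrm{HK},1}(p)$.
As shown by Han and Kobayashi [13], the coding scheme achieves any rate pair $(R_1, R_2)$ that is in the interior of

$$\mathscr{R}_{\mathrm{HK}} = \mathrm{Proj}_{4 \to 2} \Bigl( \bigcup_{p} \mathscr{R}_{\mathrm{HK},1}(p) \cap \mathscr{R}_{\mathrm{HK},2}(p) \Bigr), \tag{18}$$

where $\mathrm{Proj}_{4 \to 2}$ is the projection that maps the 4-dimensional (convex) set of rate tuples $(R_{11}, R_{12}, R_{21}, R_{22})$ into
a 2-dimensional rate region of rate pairs $(R_1, R_2) = (R_{11} + R_{12}, R_{21} + R_{22})$.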

We are interested in finding the rate region that is achievable by the Han-Kobayashi encoding functions in
conjunction with the optimal decoding functions. To this end, note that by combining the channel and the deterministic
mappings as indicated by the dashed box in Figure 7, the channel $(U_{11}, U_{12}, U_{21}, U_{22}) \to (Y_1, Y_2)$ is a $(4, 2)$-DM-IN.
After removing the artificial requirement for each decoder to recover the interfering sender's common message, the
message demands are $\mathcal{D}_1 = \{11, 12\}$ and $\mathcal{D}_2 = \{21, 22\}$. Moreover, the Han-Kobayashi encoding scheme is in fact
the $p$-distributed random code ensemble applied to this network, as defined in Section III.

Definition 3. The optimal rate region $\mathscr{R}_{\mathrm{opt}}$ achievable by the Han-Kobayashi random code ensemble is defined as

$$\mathscr{R}_{\mathrm{opt}} = \mathrm{Proj}_{4 \to 2} \Bigl( \bigcup_{p} \mathscr{R}^*(p) \Bigr),$$

where the union is over pmfs of the form $p = p(q)\,p(u_{11}|q)\,p(u_{12}|q)\,p(u_{21}|q)\,p(u_{22}|q)\,p(x_1|u_{11}, u_{12})\,p(x_2|u_{21}, u_{22})$
with the latter two factors representing deterministic mappings $x_1(u_{11}, u_{12})$ and $x_2(u_{21}, u_{22})$, and $\mathscr{R}^*(p)$ is the
optimal rate region achievable by the $p(q)\,p(u_{11}|q)\,p(u_{12}|q)\,p(u_{21}|q)\,p(u_{22}|q)$-distributed random code ensemble for
the $(4, 2)$-DM-IN $p(y_1, y_2 | u_{11}, u_{12}, u_{21}, u_{22}) = p_{Y_1, Y_2 | X_1, X_2}(y_1, y_2 | x_1(u_{11}, u_{12}), x_2(u_{21}, u_{22}))$ (cf. Definition 2).

Then Theorem 2 implies the following. 
Corollary 1. $\mathscr{R}_{\mathrm{opt}} = \mathscr{R}_{\mathrm{HK}}$.

The corollary states that the Han-Kobayashi inner bound is optimal when encoding is restricted to randomly 
generated codebooks, superposition coding, and coded time sharing. It cannot be enlarged by replacing the decoders 
used in the proof of (17) with optimal decoders. 

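Proof of Corollary 1: Applying Theorem 2 to the definition of $\mathscr{R}_{\mathrm{opt}}$ yields

$$\mathscr{R}_{\mathrm{opt}} = \mathrm{Proj}_{4 \to 2} \Bigl( \bigcup_{p} \mathscr{R}_1(p) \cap \mathscr{R}_2(p) \Bigr),$$

where $\mathscr{R}_1(p)$ is the set of rate tuples $(R_{11}, R_{12}, R_{21}, R_{22})$ such that

$$R_{\mathcal{T}_1} \le I(U_{\mathcal{T}_1}; Y_1 \mid U_{\mathcal{S}_1 \setminus \mathcal{T}_1}, Q) \quad \text{for all } \mathcal{T}_1 \subseteq \mathcal{S}_1 \tag{19}$$

for some $\mathcal{S}_1$ with $\{11, 12\} \subseteq \mathcal{S}_1 \subseteq \{11, 12, 21, 22\}$. Likewise, $\mathscr{R}_2(p)$ is the set of rate tuples that satisfy

$$R_{\mathcal{T}_2} \le I(U_{\mathcal{T}_2}; Y_2 \mid U_{\mathcal{S}_2 \setminus \mathcal{T}_2}, Q) \quad \text{for all } \mathcal{T}_2 \subseteq \mathcal{S}_2 \tag{20}$$

for some $\mathcal{S}_2$ with $\{21, 22\} \subseteq \mathcal{S}_2 \subseteq \{11, 12, 21, 22\}$. Here, $\mathcal{S}_1$ and $\mathcal{S}_2$ contain the indices of the messages recovered
by receivers 1 and 2, respectively.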
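
The admissible decoding sets in (19) and (20) are easy to enumerate. The short Python sketch below is only a bookkeeping aid and not part of the paper; the layer labels are strings and the helper names `decoding_sets` and `constraints` are hypothetical. It lists every pair $(\mathcal{S}_1, \mathcal{S}_2)$ and, for a given set, the nonempty subsets $\mathcal{T}$ that each contribute one inequality.

```python
from itertools import combinations

LAYERS = ("11", "12", "21", "22")


def decoding_sets(desired):
    """All sets S with desired ⊆ S ⊆ LAYERS."""
    optional = [k for k in LAYERS if k not in desired]
    return [frozenset(desired) | frozenset(extra)
            for r in range(len(optional) + 1)
            for extra in combinations(optional, r)]


def constraints(S):
    """Nonempty subsets T of S; each one yields a bound on the sum rate R_T."""
    members = sorted(S)
    return [T for r in range(1, len(members) + 1) for T in combinations(members, r)]


# Every combination of a decoding set at receiver 1 and one at receiver 2.
cases = [(S1, S2)
         for S1 in decoding_sets({"11", "12"})
         for S2 in decoding_sets({"21", "22"})]
```

For example, `constraints(frozenset({"11", "12", "21"}))` returns seven subsets, one per inequality when receiver 1 decodes its own two layers and the interfering common layer.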

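In order to compare $\mathscr{R}_{\mathrm{opt}}$ to $\mathscr{R}_{\mathrm{HK}}$, recall (17) and (18) and the compact description of $\mathscr{R}_{\mathrm{HK}}$ in [7] as the set of
all rate pairs $(R_1, R_2)$ such that

$$\begin{aligned}
R_1 &\le I(U_{11}, U_{12}; Y_1 \mid U_{21}, Q), && \text{(21a)} \\
R_2 &\le I(U_{21}, U_{22}; Y_2 \mid U_{12}, Q), && \text{(21b)} \\
R_1 + R_2 &\le I(U_{11}, U_{12}, U_{21}; Y_1 \mid Q) + I(U_{22}; Y_2 \mid U_{12}, U_{21}, Q), && \text{(21c)} \\
R_1 + R_2 &\le I(U_{12}, U_{21}, U_{22}; Y_2 \mid Q) + I(U_{11}; Y_1 \mid U_{12}, U_{21}, Q), && \text{(21d)} \\
R_1 + R_2 &\le I(U_{11}, U_{21}; Y_1 \mid U_{12}, Q) + I(U_{12}, U_{22}; Y_2 \mid U_{21}, Q), && \text{(21e)} \\
2R_1 + R_2 &\le I(U_{11}, U_{12}, U_{21}; Y_1 \mid Q) + I(U_{11}; Y_1 \mid U_{12}, U_{21}, Q) + I(U_{12}, U_{22}; Y_2 \mid U_{21}, Q), && \text{(21f)} \\
R_1 + 2R_2 &\le I(U_{12}, U_{21}, U_{22}; Y_2 \mid Q) + I(U_{22}; Y_2 \mid U_{12}, U_{21}, Q) + I(U_{11}, U_{21}; Y_1 \mid U_{12}, Q) && \text{(21g)}
\end{aligned}$$

for some pmf of the form $p = p(q)\,p(u_{11}|q)\,p(u_{12}|q)\,p(u_{21}|q)\,p(u_{22}|q)\,p(x_1|u_{11}, u_{12})\,p(x_2|u_{21}, u_{22})$, where the
latter two factors represent deterministic mappings $x_1(u_{11}, u_{12})$ and $x_2(u_{21}, u_{22})$.

It is easy to see that $\mathscr{R}_{\mathrm{HK}} \subseteq \mathscr{R}_{\mathrm{opt}}$. Choosing $\mathcal{S}_1 = \{11, 12, 21\}$ in (19), the resulting conditions coincide with the
ones in (17), and the constituent sets satisfy the condition $\mathscr{R}_{\mathrm{HK},1}(p) \subseteq \mathscr{R}_1(p)$. Likewise, choosing $\mathcal{S}_2 = \{12, 21, 22\}$
in (20), $\mathscr{R}_{\mathrm{HK},2}(p) \subseteq \mathscr{R}_2(p)$, and the desired inclusion follows.

To show that $\mathscr{R}_{\mathrm{opt}} \subseteq \mathscr{R}_{\mathrm{HK}}$, note that conditions (19) and (20) must hold for some $\mathcal{S}_1 \supseteq \{11, 12\}$ and $\mathcal{S}_2 \supseteq \{21, 22\}$.
For each of the 16 possible choices of $\mathcal{S}_1$ and $\mathcal{S}_2$, the resulting rate region is (directly or indirectly) included in
$\mathscr{R}_{\mathrm{HK}}$ as follows (see Figure 8).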



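[Figure 8: grid of the 16 cases, indexed by $\mathcal{S}_1 \in \{\{11,12\}, \{11,12,21\}, \{11,12,22\}, \{11,12,21,22\}\}$ and the corresponding choices of $\mathcal{S}_2$, with arrows between cases.]

Figure 8. Different cases of $\mathcal{S}_1$ and $\mathcal{S}_2$ for the region $\mathscr{R}_{\mathrm{opt}}$ and the inclusion of the corresponding regions in $\mathscr{R}_{\mathrm{HK}}$. An arrow
from A to B means that the region achieved by case A is included in the region achieved by case B.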



• If $\mathcal{S}_1 = \{11, 12, 21\}$ and $\mathcal{S}_2 = \{21, 22, 12\}$, we obtain precisely $\mathscr{R}_{\mathrm{HK}}$ (depicted as a dashed box in the figure).

• If $\mathcal{S}_1 = \{11, 12, 21, 22\}$, both receivers decode for the messages with indices $\{21, 22\}$. This is equivalent to letting
$U'_{21} = (U_{21}, U_{22})$, $U'_{22} = \emptyset$, and $\mathcal{S}'_1 = \{11, 12, 21\}$. A symmetric argument holds if $\mathcal{S}_2 = \{21, 22, 11, 12\}$.

• If $\mathcal{S}_1 = \{11, 12, 22\}$, then $\mathcal{S}_1$ can be replaced by $\{11, 12, 21\}$ by exchanging the roles of $U_{21}$ and $U_{22}$. The
exchange will not affect receiver 2, since the two auxiliary random variables play symmetric roles there. A
symmetric argument holds if $\mathcal{S}_2 = \{21, 22, 11\}$.

• If $\mathcal{S}_1 = \{11, 12\}$ and $\mathcal{S}_2 = \{21, 22\}$, we apply Fourier-Motzkin elimination and arrive at

$$R_1 \le I(X_1; Y_1 \mid Q),$$
$$R_2 \le I(X_2; Y_2 \mid Q).$$

This region is a subset of the one in (21) when the latter is specialized to $U_{12} = U_{21} = \emptyset$, $U_{11} = X_1$, and
$U_{22} = X_2$.

• If $\mathcal{S}_1 = \{11, 12\}$ and $\mathcal{S}_2 = \{21, 22, 12\}$, Fourier-Motzkin elimination leads to

$$R_1 \le I(X_1; Y_1 \mid Q),$$
$$R_1 \le I(X_1; Y_1 \mid U_{12}, Q) + I(U_{12}; Y_2 \mid X_2, Q),$$
$$R_2 \le I(X_2; Y_2 \mid U_{12}, Q),$$
$$R_1 + R_2 \le I(X_1; Y_1 \mid U_{12}, Q) + I(U_{12}, X_2; Y_2 \mid Q).$$

Again, this region is a subset of the one in (21), namely when the latter is specialized to $U_{21} = \emptyset$ and $U_{22} = X_2$.
A symmetric argument holds if $\mathcal{S}_1 = \{11, 12, 21\}$ and $\mathcal{S}_2 = \{21, 22\}$.
This concludes the proof of Corollary 1. ■
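
As a worked illustration of the Fourier-Motzkin step in the simplest case $\mathcal{S}_1 = \{11, 12\}$ (this derivation is our own sketch of a step the proof leaves implicit), the conditions at receiver 1 read

$$R_{11} \le I(U_{11}; Y_1 \mid U_{12}, Q), \qquad R_{12} \le I(U_{12}; Y_1 \mid U_{11}, Q), \qquad R_{11} + R_{12} \le I(U_{11}, U_{12}; Y_1 \mid Q).$$

Substituting $R_{12} = R_1 - R_{11}$ and eliminating $R_{11}$ subject to $R_{11} \ge 0$ and $R_{12} \ge 0$ leaves

$$R_1 \le I(U_{11}; Y_1 \mid U_{12}, Q) + I(U_{12}; Y_1 \mid U_{11}, Q), \qquad R_1 \le I(U_{11}, U_{12}; Y_1 \mid Q).$$

The first bound is redundant: since $U_{11}$ and $U_{12}$ are conditionally independent given $Q$, we have $I(U_{12}; Y_1 \mid Q) \le I(U_{12}; Y_1 \mid U_{11}, Q)$, and therefore

$$I(U_{11}, U_{12}; Y_1 \mid Q) = I(U_{12}; Y_1 \mid Q) + I(U_{11}; Y_1 \mid U_{12}, Q) \le I(U_{12}; Y_1 \mid U_{11}, Q) + I(U_{11}; Y_1 \mid U_{12}, Q).$$

Finally, $X_1$ is a deterministic function of $(U_{11}, U_{12})$, and given $Q$ the output $Y_1$ depends on $(U_{11}, U_{12})$ only through $X_1$, so $I(U_{11}, U_{12}; Y_1 \mid Q) = I(X_1; Y_1 \mid Q)$, which gives the stated bound $R_1 \le I(X_1; Y_1 \mid Q)$.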

V. Concluding remarks 

Taking a modular approach to the problem of finding the capacity region of the interference network, we have 
studied the performance of random code ensembles. This result provides a simple characterization of the rate region 
achievable by the optimal maximum likelihood decoding rule and invites more refined studies on the performance of 
random coding for interference networks, such as the error exponent analysis (cf. [14], [19]) and Verdú's finite-block
performance bounds [24]. 

The optimal rate region can be achieved by simultaneous nonunique decoding, which can be useful in other coding 
schemes such as Marton coding for broadcast channels [16] and noisy network coding for relay networks [15]. 
Although its performance can be achieved also by an appropriate combination of simultaneous decoding (SD) of 
strong interference and treating weak interference as noise (IAN) [2], [5], [17], simultaneous nonunique decoding 
provides a conceptual unification of SD and IAN, recovering all possible combinations of the two schemes at each 
receiver. Indeed, as with "the one ring to rule them all" [22], simultaneous nonunique decoding is the one rule that 
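includes them all.

Appendix A
Proof of Lemma 1

Clearly, the right hand side of the equality is an upper bound to the left hand side, since

$$H(Y_1^n \mid X_1^n, \mathcal{C}_n) \le n H(Y_1 \mid X_1, Q),$$

and

$$\begin{aligned}
H(Y_1^n \mid X_1^n, \mathcal{C}_n) &\le H(Y_1^n, M_2 \mid X_1^n, \mathcal{C}_n) \\
&= nR_2 + H(Y_1^n \mid X_1^n, X_2^n, \mathcal{C}_n) \\
&\le nR_2 + n H(Y_1 \mid X_1, X_2, Q),
\end{aligned}$$

where we have used the codebook structure and the fact that the channel is memoryless.

To see that the right hand side is also a valid lower bound, note that

$$H(Y_1^n \mid X_1^n, \mathcal{C}_n) = \underbrace{H(Y_1^n \mid X_1^n, X_2^n, \mathcal{C}_n)}_{= nH(Y_1 \mid X_1, X_2, Q)} + \underbrace{H(M_2 \mid X_1^n, \mathcal{C}_n)}_{= nR_2} - H(M_2 \mid X_1^n, \mathcal{C}_n, Y_1^n).$$

Next, we find an upper bound on $H(M_2 \mid X_1^n, \mathcal{C}_n, Y_1^n)$ by showing that given $X_1^n$, $\mathcal{C}_n$, and $Y_1^n$, a relatively short
list $\mathcal{L} \subseteq [1 : 2^{nR_2}]$ can be constructed that contains $M_2$ with high probability (the idea is similar to the proof of
Lemma 22.1 in [10]). Without loss of generality, assume $M_2 = 1$. Fix an $\epsilon > 0$ and define the random set

$$\mathcal{L} = \bigl\{ m_2 : (Q^n, X_1^n, X_2^n(m_2), Y_1^n) \in \mathcal{T}_\epsilon^{(n)} \bigr\}.$$

To analyze the cardinality $|\mathcal{L}|$, note that, for each $m_2 \ne 1$,

$$\begin{aligned}
\mathsf{P}\bigl\{(Q^n, X_1^n, X_2^n(m_2), Y_1^n) \in \mathcal{T}_\epsilon^{(n)}\bigr\}
&= \sum_{q^n, x_1^n, x_2^n} \mathsf{P}\{Q^n = q^n, X_1^n = x_1^n, X_2^n(m_2) = x_2^n\} \cdot \mathsf{P}\bigl\{(q^n, x_1^n, x_2^n, Y_1^n) \in \mathcal{T}_\epsilon^{(n)} \bigm| Q^n = q^n, X_1^n = x_1^n\bigr\} \\
&\overset{(a)}{\le} 2^{-n(I(X_2; Y_1 \mid X_1, Q) - \delta(\epsilon))},
\end{aligned}$$

where (a) follows by the joint typicality lemma. Thus, the cardinality $|\mathcal{L}|$ satisfies $|\mathcal{L}| \le 1 + B$, where $B$ is a
binomial random variable with $2^{nR_2} - 1$ trials and success probability at most $2^{-n(I(X_2; Y_1 \mid X_1, Q) - \delta(\epsilon))}$. The expected
cardinality is therefore bounded as

$$\mathsf{E}(|\mathcal{L}|) \le 1 + 2^{n(R_2 - I(X_2; Y_1 \mid X_1, Q) + \delta(\epsilon))}. \tag{22}$$

Note that the true $M_2$ is contained in the list with high probability, i.e., $1 \in \mathcal{L}$, since by the weak law of large numbers,

$$\mathsf{P}\bigl\{(Q^n, X_1^n, X_2^n(1), Y_1^n) \in \mathcal{T}_\epsilon^{(n)}\bigr\} \to 1 \quad \text{as } n \to \infty.$$

Define the indicator random variable $E = \mathbb{1}(1 \in \mathcal{L})$, which therefore satisfies $\mathsf{P}\{E = 0\} \to 0$ as $n \to \infty$. Hence

$$\begin{aligned}
H(M_2 \mid X_1^n, \mathcal{C}_n, Y_1^n) &= H(M_2 \mid X_1^n, \mathcal{C}_n, Y_1^n, E) + I(M_2; E \mid X_1^n, \mathcal{C}_n, Y_1^n) \\
&\le H(M_2 \mid X_1^n, \mathcal{C}_n, Y_1^n, E) + 1 \\
&= 1 + \mathsf{P}\{E = 0\} \cdot H(M_2 \mid X_1^n, \mathcal{C}_n, Y_1^n, E = 0) + \mathsf{P}\{E = 1\} \cdot H(M_2 \mid X_1^n, \mathcal{C}_n, Y_1^n, E = 1) \\
&\le 1 + nR_2\, \mathsf{P}\{E = 0\} + H(M_2 \mid X_1^n, \mathcal{C}_n, Y_1^n, E = 1).
\end{aligned}$$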
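For the last term, we argue that if $M_2$ is included in $\mathcal{L}$, then its conditional entropy cannot exceed $\log(|\mathcal{L}|)$:

$$\begin{aligned}
H(M_2 \mid X_1^n, \mathcal{C}_n, Y_1^n, E = 1) &\overset{(a)}{=} H(M_2 \mid X_1^n, \mathcal{C}_n, Y_1^n, E = 1, \mathcal{L}, |\mathcal{L}|) \\
&\le H(M_2 \mid E = 1, \mathcal{L}, |\mathcal{L}|) \\
&= \sum_{l=0}^{2^{nR_2}} \mathsf{P}\{|\mathcal{L}| = l\} \cdot H(M_2 \mid E = 1, \mathcal{L}, |\mathcal{L}| = l) \\
&\le \sum_{l=0}^{2^{nR_2}} \mathsf{P}\{|\mathcal{L}| = l\} \cdot \log(l) \\
&= \mathsf{E}(\log(|\mathcal{L}|)) \\
&\overset{(b)}{\le} \log(\mathsf{E}(|\mathcal{L}|)) \\
&\overset{(c)}{\le} 1 + \max\{0, n(R_2 - I(X_2; Y_1 \mid X_1, Q) + \delta(\epsilon))\},
\end{aligned}$$

where (a) follows since the list $\mathcal{L}$ and its cardinality $|\mathcal{L}|$ are functions only of $X_1^n$, $\mathcal{C}_n$, and $Y_1^n$, (b) follows by
Jensen's inequality, and (c) follows from (22) and the soft-max interpretation of the log-sum-exp function [6, p. 72].
Substituting back, we have

$$H(M_2 \mid X_1^n, \mathcal{C}_n, Y_1^n) \le 2 + nR_2\, \mathsf{P}\{E = 0\} + \max\{0, n(R_2 - I(X_2; Y_1 \mid X_1, Q) + \delta(\epsilon))\},$$

and

$$\begin{aligned}
\frac{1}{n} H(Y_1^n \mid X_1^n, \mathcal{C}_n) &\ge H(Y_1 \mid X_1, X_2, Q) + R_2 - \frac{2}{n} - R_2\, \mathsf{P}\{E = 0\} - \max\{0, R_2 - I(X_2; Y_1 \mid X_1, Q) + \delta(\epsilon)\} \\
&\ge H(Y_1 \mid X_1, X_2, Q) + \min\{R_2, I(X_2; Y_1 \mid X_1, Q) - \delta(\epsilon)\} - \frac{2}{n} - R_2\, \mathsf{P}\{E = 0\}.
\end{aligned}$$

Taking the limit as $n \to \infty$, and noting that we are free to choose $\epsilon$ such that $\delta(\epsilon)$ becomes arbitrarily small, the
desired result follows.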
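
Step (c) rests on the elementary bound $\max\{0, a\} \le \log_2(1 + 2^a) \le 1 + \max\{0, a\}$. The Python sketch below, which is an illustration and not part of the paper, evaluates the normalized logarithm of the right hand side of (22) and shows how it approaches $\max\{0, R_2 - I\}$ as $n$ grows. The function names and the numerical values of the rate and the mutual information are hypothetical.

```python
from math import log2


def log2_one_plus_pow2(a):
    """Numerically stable evaluation of log2(1 + 2**a)."""
    m = max(0.0, a)
    # Factor out 2**m so that both remaining exponents are non-positive.
    return m + log2(2.0 ** (-m) + 2.0 ** (a - m))


def normalized_list_size_bound(n, rate, info, delta=0.0):
    """(1/n) log2 of the bound 1 + 2^{n (rate - info + delta)} on the expected list size."""
    return log2_one_plus_pow2(n * (rate - info + delta)) / n


# Illustrative values: mutual information 0.5 bit, one rate above it and one below.
above = normalized_list_size_bound(n=1000, rate=0.8, info=0.5)
below = normalized_list_size_bound(n=1000, rate=0.2, info=0.5)
```

Subtracting these exponents from the rate gives approximately $\min\{R_2, I\}$ in both regimes, which is the term that appears in the lower bound above.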

Appendix B 

Equivalence between the Min and MAC Forms 

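Fix a distribution $p = p(q)\,p(x_1|q) \cdots p(x_K|q)$ and a rate tuple $(R_1, \ldots, R_K)$. We show that the conditions (15)
and (16) are equivalent.

Proof that (15) implies (16): We are given a set $\mathcal{S}$ with $\mathcal{D}_1 \subseteq \mathcal{S} \subseteq [1:K]$. Fix an arbitrary $\mathcal{V}$ with nonempty
intersection $\mathcal{V} \cap \mathcal{D}_1$. Now consider $\mathcal{V}' = \mathcal{T} = \mathcal{S} \cap \mathcal{V}$. Note $\mathcal{V}' \cap \mathcal{D}_1 = \mathcal{V} \cap \mathcal{D}_1$ as required. Then,

$$\begin{aligned}
R_{\mathcal{V}'} = R_{\mathcal{T}} &\overset{(a)}{\le} I(X_{\mathcal{T}}; Y_1 \mid X_{\mathcal{S} \setminus \mathcal{T}}, Q) \\
&\overset{(b)}{\le} I(X_{\mathcal{T}}; Y_1 \mid X_{\mathcal{S} \setminus \mathcal{V}}, X_{[1:K] \setminus \mathcal{S} \setminus \mathcal{V}}, Q) \\
&= I(X_{\mathcal{V}'}; Y_1 \mid X_{[1:K] \setminus \mathcal{V}}, Q),
\end{aligned}$$

where (a) follows from (15), and (b) follows from the structure of $p$. ■

Proof that (16) implies (15): Denote a set $\mathcal{S} \subseteq [1 : K]$ as decodable if

$$\forall\, \mathcal{T} \subseteq \mathcal{S}: \quad R_{\mathcal{T}} \le I(X_{\mathcal{T}}; Y_1 \mid X_{\mathcal{S} \setminus \mathcal{T}}, Q).$$

Then the following proposition holds, which is proved below.

Proposition 1. If $\mathcal{S}_1$ and $\mathcal{S}_2$ are decodable sets, then $\mathcal{S}_1 \cup \mathcal{S}_2$ is a decodable set.

To determine which messages are decodable, consider the optimization problem of maximizing $|\mathcal{S}|$ over decodable
sets $\mathcal{S}$. From Proposition 1, a unique maximizer $\mathcal{S}^*$ must exist, which is a superset of all decodable sets. Consider its
complement $(\mathcal{S}^*)^c$. The intuitive reason for the messages indexed by $(\mathcal{S}^*)^c$ being undecodable is that the corresponding
rates are too large. This notion is made precise in the following proposition, which is analogous to a property for
the Gaussian case given in [2, Fact 1] and for which a proof is provided below.

Proposition 2. For all sets $\mathcal{U}$ with $\emptyset \ne \mathcal{U} \subseteq (\mathcal{S}^*)^c$, the rates satisfy

$$R_{\mathcal{U}} > I(X_{\mathcal{U}}; Y_1 \mid X_{\mathcal{S}^*}, Q). \tag{23}$$

Assuming (15) is not true, there must be some desired message index that is not decodable, i.e., $\mathcal{D}_1 \not\subseteq \mathcal{S}^*$, or
equivalently, $(\mathcal{S}^*)^c \cap \mathcal{D}_1 \ne \emptyset$. Then we can choose $\mathcal{V} = (\mathcal{S}^*)^c$ in (16), yielding

$$\exists\, \mathcal{V}' \subseteq (\mathcal{S}^*)^c,\ \mathcal{V}' \cap \mathcal{D}_1 = (\mathcal{S}^*)^c \cap \mathcal{D}_1: \quad R_{\mathcal{V}'} \le I(X_{\mathcal{V}'}; Y_1 \mid X_{\mathcal{S}^*}, Q),$$

which contradicts (23). This proves that (16) implies (15). ■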
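
The roles of Propositions 1 and 2 can be illustrated numerically. The Python sketch below is not from the paper: it assumes the Gaussian setting of Example 1 at a single receiver (independent unit-power inputs, unit noise variance, no time sharing), and the helper names are hypothetical. It finds the maximal decodable set by brute force as the union of all decodable sets, and then checks that every nonempty subset of its complement violates the corresponding rate bound, as (23) states.

```python
from itertools import combinations
from math import log2
import random


def nonempty_subsets(items):
    """All nonempty subsets of `items`, as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(1, len(items) + 1) for c in combinations(items, r)]


def mutual_info(gains, A, B):
    """I(X_A; Y_1 | X_B) in bits; senders outside A and B act as Gaussian noise."""
    signal = sum(gains[k] ** 2 for k in A)
    noise = 1.0 + sum(g ** 2 for k, g in enumerate(gains) if k not in A and k not in B)
    return 0.5 * log2(1.0 + signal / noise)


def rate_sum(rates, T):
    return sum(rates[k] for k in T)


def is_decodable(S, rates, gains):
    """Check R_T <= I(X_T; Y_1 | X_{S minus T}) for every nonempty T inside S."""
    S = frozenset(S)
    return all(rate_sum(rates, T) <= mutual_info(gains, T, S - T)
               for T in nonempty_subsets(S))


def maximal_decodable_set(rates, gains):
    """Union of all decodable sets (the empty set is always decodable)."""
    candidates = [frozenset()] + nonempty_subsets(range(len(rates)))
    decodable = [S for S in candidates if is_decodable(S, rates, gains)]
    return frozenset().union(*decodable)


# One random instance with four senders.
rng = random.Random(1)
gains = [rng.uniform(0.2, 2.0) for _ in range(4)]
rates = [rng.uniform(0.0, 1.0) for _ in range(4)]
s_star = maximal_decodable_set(rates, gains)
complement = frozenset(range(4)) - s_star
violations = all(rate_sum(rates, U) > mutual_info(gains, U, s_star)
                 for U in nonempty_subsets(complement))
```

The brute-force search is exponential in the number of senders and serves only to make the structure of the argument tangible.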
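Proof of Proposition 1: Since $\mathcal{S}_1$ and $\mathcal{S}_2$ are decodable, we have

$$R_{\mathcal{T}} \le I(X_{\mathcal{T}}; Y_1 \mid X_{\mathcal{S}_1 \setminus \mathcal{T}}, Q) \quad \text{for all } \mathcal{T} \subseteq \mathcal{S}_1,$$
$$R_{\mathcal{T}'} \le I(X_{\mathcal{T}'}; Y_1 \mid X_{\mathcal{S}_2 \setminus \mathcal{T}'}, Q) \quad \text{for all } \mathcal{T}' \subseteq \mathcal{S}_2,$$

and we need to show

$$R_{\mathcal{T}''} \le I(X_{\mathcal{T}''}; Y_1 \mid X_{(\mathcal{S}_1 \cup \mathcal{S}_2) \setminus \mathcal{T}''}, Q) \quad \text{for all } \mathcal{T}'' \subseteq \mathcal{S}_1 \cup \mathcal{S}_2.$$

Fix a subset $\mathcal{T}'' \subseteq \mathcal{S}_1 \cup \mathcal{S}_2$ and partition it as $\mathcal{T}'' = \mathcal{T}_1'' \cup \mathcal{T}_2''$ where $\mathcal{T}_1'' \subseteq \mathcal{S}_1$, $\mathcal{T}_2'' \subseteq \mathcal{S}_2$, $\mathcal{T}_1'' \cap \mathcal{T}_2'' = \emptyset$, and
$\mathcal{T}_2'' \cap \mathcal{S}_1 = \emptyset$ (see Figure 9).

Figure 9. Partitioning the set $\mathcal{T}'' \subseteq \mathcal{S}_1 \cup \mathcal{S}_2$.

Then

$$\begin{aligned}
R_{\mathcal{T}''} &= R_{\mathcal{T}_1''} + R_{\mathcal{T}_2''} \\
&\le I(X_{\mathcal{T}_1''}; Y_1 \mid X_{\mathcal{S}_1 \setminus \mathcal{T}_1''}, Q) + I(X_{\mathcal{T}_2''}; Y_1 \mid X_{\mathcal{S}_2 \setminus \mathcal{T}_2''}, Q) \\
&\le I(X_{\mathcal{T}_1''}; Y_1 \mid X_{(\mathcal{S}_1 \cup \mathcal{S}_2) \setminus \mathcal{T}''}, Q) + I(X_{\mathcal{T}_2''}; Y_1 \mid X_{(\mathcal{S}_1 \cup \mathcal{S}_2) \setminus \mathcal{T}''}, X_{\mathcal{T}_1''}, Q) \\
&= I(X_{\mathcal{T}''}; Y_1 \mid X_{(\mathcal{S}_1 \cup \mathcal{S}_2) \setminus \mathcal{T}''}, Q),
\end{aligned}$$

where the second inequality holds because conditioning on additional independent inputs cannot decrease the mutual information, and the last step is the chain rule. This concludes the proof. ■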

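Proof of Proposition 2: Assume first that the proposition was not true. Then there must be a minimal $\mathcal{U}$ with
$\emptyset \ne \mathcal{U} \subseteq (\mathcal{S}^*)^c$ such that

$$R_{\mathcal{U}} \le I(X_{\mathcal{U}}; Y_1 \mid X_{\mathcal{S}^*}, Q), \tag{24}$$
$$R_{\mathcal{U} \setminus \mathcal{T}} > I(X_{\mathcal{U} \setminus \mathcal{T}}; Y_1 \mid X_{\mathcal{S}^*}, Q) \quad \text{for all } \mathcal{T} \text{ with } \emptyset \subset \mathcal{T} \subset \mathcal{U}.$$

Now,

$$\begin{aligned}
R_{\mathcal{T}} = R_{\mathcal{U}} - R_{\mathcal{U} \setminus \mathcal{T}} &< I(X_{\mathcal{U}}; Y_1 \mid X_{\mathcal{S}^*}, Q) - I(X_{\mathcal{U} \setminus \mathcal{T}}; Y_1 \mid X_{\mathcal{S}^*}, Q) \\
&= I(X_{\mathcal{T}}; Y_1 \mid X_{\mathcal{S}^*}, X_{\mathcal{U} \setminus \mathcal{T}}, Q) \quad \text{for all } \mathcal{T} \text{ satisfying } \emptyset \subset \mathcal{T} \subset \mathcal{U}.
\end{aligned}$$

Recalling (24), the last statement continues to hold (with non-strict inequality) for $\mathcal{T} = \mathcal{U}$. Thus,

$$R_{\mathcal{T}} \le I(X_{\mathcal{T}}; Y_1 \mid X_{\mathcal{S}^*}, X_{\mathcal{U} \setminus \mathcal{T}}, Q) \quad \text{for all } \mathcal{T} \subseteq \mathcal{U}. \tag{25}$$

We are going to show that $\mathcal{S}^* \cup \mathcal{U}$ is decodable, which contradicts the definition of $\mathcal{S}^*$ as the maximum decodable
set since $\mathcal{U}$ is non-empty and does not intersect $\mathcal{S}^*$. To this end, consider an arbitrary $\mathcal{T}' \subseteq \mathcal{S}^* \cup \mathcal{U}$ and partition it
as $\mathcal{T}' = \mathcal{T}_1' \cup \mathcal{T}_2'$ with $\mathcal{T}_1' \cap \mathcal{T}_2' = \emptyset$, $\mathcal{T}_1' \subseteq \mathcal{S}^*$, and $\mathcal{T}_2' \subseteq \mathcal{U}$ (see Figure 10).

Figure 10. Partitioning the set $\mathcal{T}' \subseteq \mathcal{S}^* \cup \mathcal{U}$.

Then

$$\begin{aligned}
R_{\mathcal{T}'} &= R_{\mathcal{T}_1'} + R_{\mathcal{T}_2'} \\
&\overset{(a)}{\le} I(X_{\mathcal{T}_1'}; Y_1 \mid X_{\mathcal{S}^* \setminus \mathcal{T}_1'}, Q) + I(X_{\mathcal{T}_2'}; Y_1 \mid X_{\mathcal{S}^*}, X_{\mathcal{U} \setminus \mathcal{T}_2'}, Q) \\
&\overset{(b)}{\le} I(X_{\mathcal{T}_1'}; Y_1 \mid X_{\mathcal{S}^* \setminus \mathcal{T}_1'}, X_{\mathcal{U} \setminus \mathcal{T}_2'}, Q) + I(X_{\mathcal{T}_2'}; Y_1 \mid X_{\mathcal{S}^* \setminus \mathcal{T}_1'}, X_{\mathcal{U} \setminus \mathcal{T}_2'}, X_{\mathcal{T}_1'}, Q) \\
&= I(X_{\mathcal{T}_1'}, X_{\mathcal{T}_2'}; Y_1 \mid X_{(\mathcal{S}^* \cup \mathcal{U}) \setminus (\mathcal{T}_1' \cup \mathcal{T}_2')}, Q),
\end{aligned}$$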

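where (a) follows from $\mathcal{S}^*$ being decodable and (25), and in (b), we have augmented the first mutual information
expression and rewritten the second one. This concludes the proof by contradiction. ■

Appendix C
Proof of Lemma 2

The proof proceeds along similar lines as the proof of Lemma 1. First, we show that the right hand side is a
valid upper bound to the left hand side. For any $\mathcal{U} \subseteq \mathcal{S}^c$,

$$\begin{aligned}
H(Y_1^n \mid X_{\mathcal{S}}^n, \mathcal{C}_n) &\le H(Y_1^n, M_{\mathcal{U}} \mid X_{\mathcal{S}}^n, \mathcal{C}_n) \\
&= nR_{\mathcal{U}} + H(Y_1^n \mid X_{\mathcal{S}}^n, X_{\mathcal{U}}^n, \mathcal{C}_n) \\
&\le nR_{\mathcal{U}} + n H(Y_1 \mid X_{\mathcal{S}}, X_{\mathcal{U}}, Q) \\
&= nR_{\mathcal{U}} + n H(Y_1 \mid X_{[1:K]}, Q) + n I(X_{(\mathcal{S} \cup \mathcal{U})^c}; Y_1 \mid X_{\mathcal{S} \cup \mathcal{U}}, Q),
\end{aligned}$$

where we have used the codebook structure.

To see that the right hand side is a valid lower bound to the left hand side, note

$$H(Y_1^n \mid X_{\mathcal{S}}^n, \mathcal{C}_n) = n H(Y_1 \mid X_{[1:K]}, Q) + nR_{\mathcal{S}^c} - H(M_{\mathcal{S}^c} \mid X_{\mathcal{S}}^n, Y_1^n, \mathcal{C}_n).$$

Without loss of generality, assume $M_k = 1$ for $k \in \mathcal{S}^c$. Fix an $\epsilon > 0$ and define the random set

$$\mathcal{L} = \bigl\{ m_{\mathcal{S}^c} : (Q^n, X_k^n|_{k \in \mathcal{D}_1}, X_k^n(m_k)|_{k \in \mathcal{D}_1^c}, Y_1^n) \in \mathcal{T}_\epsilon^{(n)} \text{ with } m_k = 1 \text{ for all } k \in \mathcal{D}_1^c \cap \mathcal{S} \bigr\}.$$

To analyze the cardinality $|\mathcal{L}|$, fix an $m_{\mathcal{S}^c}$ and consider the probability of $m_{\mathcal{S}^c} \in \mathcal{L}$. If $m_k \ne 1$ for all $k \in \mathcal{S}^c$, and
$m_k = 1$ otherwise, then the joint typicality lemma implies

$$\mathsf{P}\{m_{\mathcal{S}^c} \in \mathcal{L}\} \le 2^{-n(I(X_{\mathcal{S}^c}; Y_1 \mid X_{\mathcal{S}}, Q) - \delta(\epsilon))},$$

and there are at most $2^{nR_{\mathcal{S}^c}}$ such $m_{\mathcal{S}^c}$. More generally, fix a subset $\mathcal{U} \subseteq \mathcal{S}^c$. If $m_k \ne 1$ for $k \in \mathcal{S}^c \setminus \mathcal{U}$, and $m_k = 1$
otherwise, then

$$\mathsf{P}\{m_{\mathcal{S}^c} \in \mathcal{L}\} \le 2^{-n(I(X_{\mathcal{S}^c \setminus \mathcal{U}}; Y_1 \mid X_{\mathcal{S}}, X_{\mathcal{U}}, Q) - \delta(\epsilon))},$$

and there are at most $2^{nR_{\mathcal{S}^c \setminus \mathcal{U}}}$ such $m_{\mathcal{S}^c}$. Thus,

$$\mathsf{E}(|\mathcal{L}|) \le \sum_{\mathcal{U} \subseteq \mathcal{S}^c} 2^{n(R_{\mathcal{S}^c \setminus \mathcal{U}} - I(X_{\mathcal{S}^c \setminus \mathcal{U}}; Y_1 \mid X_{\mathcal{S}}, X_{\mathcal{U}}, Q) + \delta(\epsilon))}. \tag{26}$$

Define the indicator random variable $E = \mathbb{1}((1, 1, \ldots, 1) \in \mathcal{L})$, which satisfies $\mathsf{P}\{E = 0\} \to 0$ as $n \to \infty$ by the
weak law of large numbers. Now

$$H(M_{\mathcal{S}^c} \mid X_{\mathcal{S}}^n, Y_1^n, \mathcal{C}_n) \le 1 + nR_{\mathcal{S}^c}\, \mathsf{P}\{E = 0\} + H(M_{\mathcal{S}^c} \mid X_{\mathcal{S}}^n, Y_1^n, \mathcal{C}_n, E = 1).$$

For the last term, we argue

$$\begin{aligned}
H(M_{\mathcal{S}^c} \mid X_{\mathcal{S}}^n, Y_1^n, \mathcal{C}_n, E = 1) &\le \log(\mathsf{E}(|\mathcal{L}|)) \\
&\le \max_{\mathcal{U} \subseteq \mathcal{S}^c} \bigl( n(R_{\mathcal{S}^c \setminus \mathcal{U}} - I(X_{\mathcal{S}^c \setminus \mathcal{U}}; Y_1 \mid X_{\mathcal{S}}, X_{\mathcal{U}}, Q) + \delta(\epsilon)) \bigr) + |\mathcal{S}^c|,
\end{aligned}$$

where the second step uses that the sum in (26) has $2^{|\mathcal{S}^c|}$ terms. Substituting back,

$$H(M_{\mathcal{S}^c} \mid X_{\mathcal{S}}^n, Y_1^n, \mathcal{C}_n) \le 1 + |\mathcal{S}^c| + nR_{\mathcal{S}^c}\, \mathsf{P}\{E = 0\} + \max_{\mathcal{U} \subseteq \mathcal{S}^c} \bigl( n(R_{\mathcal{S}^c \setminus \mathcal{U}} - I(X_{\mathcal{S}^c \setminus \mathcal{U}}; Y_1 \mid X_{\mathcal{S}}, X_{\mathcal{U}}, Q) + \delta(\epsilon)) \bigr),$$

and finally,

$$\begin{aligned}
\frac{1}{n} H(Y_1^n \mid X_{\mathcal{S}}^n, \mathcal{C}_n) &\ge H(Y_1 \mid X_{[1:K]}, Q) + R_{\mathcal{S}^c} - \frac{1 + |\mathcal{S}^c|}{n} - R_{\mathcal{S}^c}\, \mathsf{P}\{E = 0\} \\
&\qquad - \max_{\mathcal{U} \subseteq \mathcal{S}^c} \bigl( R_{\mathcal{S}^c \setminus \mathcal{U}} - I(X_{\mathcal{S}^c \setminus \mathcal{U}}; Y_1 \mid X_{\mathcal{S}}, X_{\mathcal{U}}, Q) + \delta(\epsilon) \bigr) \\
&= H(Y_1 \mid X_{[1:K]}, Q) - \frac{1 + |\mathcal{S}^c|}{n} - R_{\mathcal{S}^c}\, \mathsf{P}\{E = 0\} \\
&\qquad + \min_{\mathcal{U} \subseteq \mathcal{S}^c} \bigl( R_{\mathcal{U}} + I(X_{\mathcal{S}^c \setminus \mathcal{U}}; Y_1 \mid X_{\mathcal{S}}, X_{\mathcal{U}}, Q) - \delta(\epsilon) \bigr).
\end{aligned}$$

Taking the limits $n \to \infty$ and $\epsilon \to 0$ concludes the proof.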

References 

[1] A. S. Avestimehr, S. N. Diggavi, and D. N. C. Tse, "Wireless network information flow: A deterministic approach," IEEE Trans. Inf. Theory, vol. 57, no. 4, pp. 1872-1905, Apr. 2011.
[2] F. Baccelli, A. El Gamal, and D. N. C. Tse, "Interference networks with point-to-point codes," IEEE Trans. Inf. Theory, vol. 57, no. 5, pp. 2582-2596, May 2011.
[3] B. Bandemer and A. El Gamal, "Interference decoding for deterministic channels," IEEE Trans. Inf. Theory, vol. 57, no. 5, pp. 2966-2975, May 2011.
[4] P. P. Bergmans, "Random coding theorem for broadcast channels with degraded components," IEEE Trans. Inf. Theory, vol. 19, no. 2, pp. 197-207, Mar. 1973.
[5] S. S. Bidokhti, V. M. Prabhakaran, and S. N. Diggavi, "Is non-unique decoding necessary?" in Proc. IEEE Int. Symp. Inf. Theory, Boston, MA, Jul. 2012.
[6] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[7] H.-F. Chong, M. Motani, H. K. Garg, and H. El Gamal, "On the Han-Kobayashi region for the interference channel," IEEE Trans. Inf. Theory, vol. 54, no. 7, pp. 3188-3195, Jul. 2008.
[8] T. M. Cover, "Broadcast channels," IEEE Trans. Inf. Theory, vol. 18, no. 1, pp. 2-14, Jan. 1972.
[9] T. M. Cover, "An achievable rate region for the broadcast channel," IEEE Trans. Inf. Theory, vol. 21, no. 4, pp. 399-404, Jul. 1975.
[10] A. El Gamal and Y.-H. Kim, Network Information Theory. Cambridge University Press, 2011.
[11] A. A. El Gamal and M. H. M. Costa, "The capacity region of a class of deterministic interference channels," IEEE Trans. Inf. Theory, vol. 28, no. 2, pp. 343-346, Mar. 1982.
[12] A. A. Gohari, A. El Gamal, and V. Anantharam, "On Marton's inner bound for the general broadcast channel," in Proc. IEEE Int. Symp. Inf. Theory, Austin, TX, Jun. 2010.
[13] T. S. Han and K. Kobayashi, "A new achievable rate region for the interference channel," IEEE Trans. Inf. Theory, vol. 27, no. 1, pp. 49-60, Jan. 1981.
[14] E. A. Haroutunian, "Lower bound for the error probability of multiple-access channels," Probl. Peredachi Inf., vol. 11, no. 2, pp. 23-36, Jun. 1975.
[15] S. H. Lim, Y.-H. Kim, A. El Gamal, and S.-Y. Chung, "Noisy network coding," IEEE Trans. Inf. Theory, vol. 57, no. 5, pp. 3132-3152, May 2011.
[16] K. Marton, "A coding theorem for the discrete memoryless broadcast channel," IEEE Trans. Inf. Theory, vol. 25, no. 3, pp. 306-311, May 1979.
[17] A. S. Motahari and A. K. Khandani, "To decode the interference or to consider it as noise," IEEE Trans. Inf. Theory, vol. 57, no. 3, pp. 1274-1283, Mar. 2011.
[18] C. Nair and A. El Gamal, "The capacity region of a class of three-receiver broadcast channels with degraded message sets," IEEE Trans. Inf. Theory, vol. 55, no. 10, pp. 4479-4493, Oct. 2009.
[19] A. Nazari, A. Anastasopoulos, and S. S. Pradhan, "Error exponent for multiple-access channels: Lower bounds," IEEE Trans. Inf. Theory, Jul. 2010, submitted for publication, arXiv:1010.1303.
[20] A. Orlitsky and J. R. Roche, "Coding for computing," IEEE Trans. Inf. Theory, vol. 47, no. 3, pp. 903-917, Mar. 2001.
[21] C. E. Shannon, "A mathematical theory of communication," Bell Syst. Tech. J., vol. 27, no. 3, pp. 379-423, vol. 27, no. 4, pp. 623-656, 1948.
[22] J. R. R. Tolkien, The Lord of the Rings. Boston: Houghton Mifflin, 1954-1956, 3 volumes.
[23] E. C. van der Meulen, "Random coding theorems for the general discrete memoryless broadcast channel," IEEE Trans. Inf. Theory, vol. 21, no. 2, pp. 180-190, Mar. 1975.
[24] S. Verdú, "Non-asymptotic achievability bounds in multiuser information theory," in Proc. 50th Ann. Allerton Conf. Commun., Control, Comput., Monticello, IL, Oct. 2012.



